STRUCTURAL ASPECTS OF SYSTEM IDENTIFICATION


by 


Keith Glover 


Submitted to the Department of Electrical Engineering on August 10, 1973
in partial fulfillment of the requirements for the Degree of Doctor of
Philosophy.


ABSTRACT 


The problem of identifying linear dynamical systems is studied. The
approach taken is to consider structural and deterministic properties
of linear systems that have an impact on stochastic identification
algorithms. In particular we consider the parametrization of linear
systems so that the identification problem is well-posed (i.e. there
is a unique solution and all systems in the appropriate class can be
represented). Firstly canonical forms for the matrix triple $(A,B,C)$
under the transformation group $(A,B,C) \to (TAT^{-1}, TB, CT^{-1})$ (where
$T \in GL(n)$) are discussed and it is shown that numerical difficulties
can occur. Then an alternate set of parametrizations which do not
have these difficulties is given with an associated realization
algorithm. It is then assumed that a parametrization of the system
matrices has been established from a priori knowledge of the system,
and the question is considered of when the unknown parameters of this
system can be identified from input/output observations. It is assumed
that the transfer function can be asymptotically identified, and the
conditions are derived for the local, global and partial identifiability
of the parametrization. Then it is shown that, with the right
formulation, identifiability in the presence of feedback can be treated in
the same way. Similarly the identifiability of parametrizations of
systems driven by unobserved white noise is considered using the
results from the theory of spectral factorization. Finally the
problems associated with parametrizations admitting multiple
representations of nonminimal systems are explored. This leads to a study of
the geometrical properties of minimal and nonminimal systems (e.g.
the codimension of the set of nonminimal systems in the parameter space).


THESIS SUPERVISOR: Professor Jan C. Willems 



ACKNOWLEDGEMENTS 


It is with great pleasure that I acknowledge the following 
individuals and organizations: 

Professor Jan C. Willems (thesis supervisor) for his stimulating 
supervision and encouragement. His theoretical insight has been invaluable. 

Professor Sanjoy K. Mitter (thesis reader) for many helpful
discussions and his patient interest in my work. 

Professors Michael Athans and Fred C. Schweppe (thesis readers) 
for the comments and practical insight they contributed. 

Professor Roger W. Brockett of Harvard for several stimulating 
discussions and original ideas. 

My friends and colleagues in the Electronic Systems Laboratory 
and elsewhere in M.I.T. for many beneficial and enjoyable discussions. 

Ms. Linda Kowach for deciphering and typing the manuscript. 

Mr. Arthur Giodani for the drafting. 

Finally my wife for her tolerance and encouragement. 

I am indebted to the Kennedy Memorial Trust for supporting me
during the first two years of my graduate studies, and the Electrical 
Engineering Department at M.I.T. for support during the third year as 
a teaching assistant. 

This research was carried out in the Electronic Systems Laboratory
with full support for the final year extended by NASA Ames Research
Center under grant NGL-22-009-124.





TABLE OF CONTENTS

                                                                  PAGE

ABSTRACT                                                             2
ACKNOWLEDGEMENTS                                                     3
TABLE OF CONTENTS                                                    4
NOTATION                                                             6

CHAPTER 1:  INTRODUCTION                                             7

CHAPTER 2:  CANONICAL FORMS FOR IDENTIFICATION                      12
    2.1  Introduction                                               12
    2.2  Popov's Canonical Form                                     18
    2.3  Other Canonical Forms                                      22
    2.4  Remarks on Using Canonical Forms for Identification        31
    2.5  A New Realization Algorithm                                37

CHAPTER 3:  PARAMETER IDENTIFIABILITY FROM INPUT/OUTPUT
            OBSERVATIONS                                            43
    3.1  Introduction                                               43
    3.2  Local Identifiability                                      46
    3.3  Global Identifiability                                     63
    3.4  Partial Identifiability                                    73
    3.5  Identifiability in the Presence of Feedback                77
    3.6  Comparison with the Information Matrix
         and Sensitivity Analysis                                   80

CHAPTER 4:  IDENTIFIABILITY FROM OUTPUT CORRELATION                 84
    4.1  Introduction                                               84
    4.2  Continuous Time Systems                                    84
    4.3  Discrete Time Systems                                      90
    4.4  Identifiability from Output Observation                    95

CHAPTER 5:  GEOMETRICAL PROPERTIES OF MINIMAL SYSTEMS               99
    5.1  Introduction                                               99
    5.2  Single Input/Single Output Systems                        101
    5.3  Multi-input/Multi-output Systems                          109
    5.4  Implications of Theorems 5.1 and 5.2                      113
    5.5  Simulation Results                                        120

CHAPTER 6:  CONCLUSIONS                                            121

BIBLIOGRAPHY                                                       124

APPENDIX I:  KRONECKER PRODUCTS                                    127

BIOGRAPHIC NOTE                                                    129



NOTATION

$I_n$                          $n \times n$ identity matrix
$0_{n,m}$                      $n \times m$ zero matrix
$GL(n)$                        general linear group $= \{T \in R^{n \times n} \mid \det T \neq 0\}$
$R^n$                          n-dimensional Euclidean space
$\mathbb{C}$                   complex plane
$A'$                           denotes the transpose of the matrix $A$
$\langle \cdot,\cdot \rangle$  inner product
$N(\cdot)$                     null space
$R(\cdot)$                     range space
$N_\epsilon(\hat{x})$          $= \{x \in X \mid \|x - \hat{x}\| < \epsilon\}$, i.e. an $\epsilon$-neighborhood
$C^k$                          k-times continuously differentiable functions
$o(h^k)$                       satisfies $\lim_{h \to 0} o(h^k)/h^k = 0$
$\|\cdot\|$                    Euclidean norm
$|\cdot|$                      magnitude
$\delta(x)$                    Dirac delta function
$\delta_{ij}$                  Kronecker delta $= 1$ if $i = j$, $0$ if $i \neq j$
$\otimes$                      Kronecker product



CHAPTER 1 


INTRODUCTION 

In order to apply the considerable advances of modern control
theory it is required to have an accurate system model. Indeed in
situations where accurate models exist (e.g. many aerospace problems)
practical applications of modern control theory have been very
successful. However in many applications accurate system models are not known
a priori but must be deduced from observations of the system in
operation. This is the so-called identification problem. The lack of
adequate system models is perhaps the greatest single obstacle to
applying modern control and filtering techniques, especially now that
applications are being attempted in areas other than aerospace where
system models are less well-understood (e.g. chemical processes,
power systems, socio-economic systems). Identification is not only
of use for subsequent control and filtering, but is often an end in
itself, for example to determine whether a new piece of machinery is
performing to specification, or to check the condition of an operating
machine.

In this thesis we are solely concerned with the identification of 
linear systems, since they are a widely applicable class of systems 
and lend themselves to a tractable mathematical treatment. System 
identification algorithms can be thought of as being of two types, 
namely off-line and on-line. Off-line algorithms are generally given 
a finite set of input-output data and from this data give estimates 
of the system parameters. On the other hand, on-line algorithms receive 
input-output data pairs and update their parameter estimate after each
additional data pair, under the restriction that the complexity of
the algorithm does not increase with time. An on-line algorithm is
therefore a restriction of an off-line algorithm but has the advantage
that it can observe a system over an arbitrarily long time interval
with an essentially constant computational effort per unit time.

The more popular on-line algorithms are stochastic approximation
(Albert and Gardner (1967), Tsypkin (1972)), least squares (Åström
and Eykhoff (1971)), and model reference (Whitaker (1958)), and perhaps
the most famous off-line technique is Åström's maximum likelihood
method (Åström and Bohlin (1966)) together with some correlation
techniques (Mehra (1971)) and instrumental variable methods (Wong and
Polak (1967)). The division into on- and off-line techniques is somewhat
arbitrary in that it depends on the implementation, and essentially
similar algorithms may be implemented both on- and off-line. An
excellent survey of identification has been given by Åström and Eykhoff
(1971) and contains some 230 references to which the reader is referred
for further background material.

Desirable properties of identification algorithms are:

(N = number of sample points)

i) unbiased parameter estimates as $N \to \infty$

ii) efficient parameter estimates (i.e., the error covariance
is close to the theoretical minimum)

iii) limited computational requirements at each N (i.e., fast
convergence if an iterative scheme is used)

Property (i) is fairly essential for any scheme but is not
in fact satisfied for the classical least squares method in all but the
most elementary systems (see Åström and Eykhoff (1971)). Properties
(ii) and (iii) are generally mutually exclusive, with for example
the maximum likelihood method satisfying (ii) but not (iii), and
the stochastic approximation method satisfying (iii) but not (ii).

Essentially any identification algorithm can be considered as 
the minimization (by some numerical method) of some cost function 
(which depends on the system parameters and the observation) over the 
unknown system parameters. Such algorithms have two aspects, firstly 
a stochastic aspect which depends entirely on the choice of cost function 
and will determine properties (i) and (ii) above, and secondly a deter- 
ministic aspect that determines (iii) above and indeed whether there is 
a unique solution to the minimization problem. The study of the sto- 
chastic aspects is the area where most work on identification has been 
done, and this essentially involves producing new cost functions that 
have superior properties. This problem has been studied extensively 
in the statistics, econometrics and time-series analysis literature 
(see for example Box and Jenkins (1970)) . Much of this work begins 
with scalar difference (or differential) equations representing the 
system, and does not take a state space point of view. 

For scalar input or scalar output systems many of the structural 
problems of linear systems are not manifest because there is a natural 
parametrization (i.e. standard controllable or standard observable 
form) . However for multivariable systems there are significant 
parametrization problems and the study of the parametrization of linear 
systems forms the main body of this research. Firstly suppose one 
wishes to model an unknown system with a state space model of a certain
dimension, then what parametrization is appropriate for identification? 
Clearly an arbitrary parametrization, for example the system matrices 
being completely free, may not be suitable because there are many 
distinct state space realizations of a particular input/output response. 
This question is considered in Chapter 2 where canonical forms are 
discussed and their shortcomings in this context are examined. A pre- 
ferable set of parametrizations for identification are then given with 
an associated realization algorithm. 

When standard parametrizations are used there is no immediate 
physical interpretation of the states or system parameters, whereas 
in many applications state equations can be derived where there is 
a natural interpretation of the states and coefficients but some of 
the numerical values of the coefficients will be unknown. In such 
cases the system matrices will be parametrized by the unknown para- 
meters, and a natural question is whether a particular set of unknown 
parameters can be identified. This problem is considered in Chapter 3 
where local, global and partial identifiability and identifiability in
the presence of feedback, given input/output observations, are considered
and straightforward conditions are derived.

When a system is driven by an unobserved white noise process 
and the output is observed then the identification problem is more 
difficult. This is the spectral factorization problem which has its 
origins with Wiener and Masani (1958). Chapter 4 gives some background
material and derives conditions for local identifiability under these
conditions.
In Chapter 5 the problems associated with nonminimal systems are
considered. Most system parametrizations admit multiple representations
of nonminimal systems and hence at such systems the identification
problem does not have a unique solution. Therefore the geometrical
properties of the minimal and nonminimal systems in the parameter
space are important. They are examined in that chapter, where it is shown
that at least for single input/single output systems difficulties
with nonminimal systems are likely to be encountered.



CHAPTER 2 


CANONICAL FORMS FOR IDENTIFICATION 
2.1 Introduction 

In this chapter we first examine the need in identification for 
parametrizations of linear dynamical systems. Then two examples of 
parametrizations which are canonical forms are given, including some 
new results on these. Some disadvantages of using canonical forms are 
then given by way of an illustrative example, and an alternate para- 
metrization is proposed which avoids these difficulties. Finally a 
new realization algorithm is given which is computationally efficient, 
numerically robust and gives the resulting matrices in a nice form. 

Consider the linear continuous or discrete time dynamical
systems,

$$\frac{dx(t)}{dt} = Ax(t) + Bu(t), \qquad y(t) = Cx(t)$$

$$x(k+1) = Ax(k) + Bu(k), \qquad y(k) = Cx(k)$$

where $x \in R^n$, $u \in R^m$, $y \in R^p$.

Suppose that it is desired to identify such a system from
(possibly noisy) observations of u and y. Assume that only the input/
output properties of this system can be identified. For example,

i) The Markov parameters, $C A^k B$, $k = 0, 1, 2, \ldots$

ii) The transfer function, $G(s) = C(Is - A)^{-1}B$.
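
As a rough illustration of these input/output quantities, the following Python sketch computes the first few Markov parameters of a small system. The numerical matrices and the helper names `matmul` and `markov_parameters` are assumptions made for this example and do not come from the thesis. The two descriptions are linked by the standard expansion $G(s) = \sum_{k \ge 0} C A^k B \, s^{-(k+1)}$, valid for large $|s|$.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def markov_parameters(A, B, C, count):
    """Return the list [C B, C A B, ..., C A^(count-1) B]."""
    params = []
    AkB = B                          # holds A^k B, starting with k = 0
    for _ in range(count):
        params.append(matmul(C, AkB))
        AkB = matmul(A, AkB)
    return params


# Hypothetical 2-state, single input/single output system,
# with transfer function G(s) = 1 / (s^2 + 3 s + 2).
A = [[0, 1], [-2, -3]]
B = [[0], [1]]
C = [[1, 0]]

M = markov_parameters(A, B, C, 4)
print(M)   # coefficients of s^-1, s^-2, s^-3, s^-4 in the expansion of G(s)
```

For this example the parameters are 0, 1, -3, 7, which agree with the series $1/(s^2 + 3s + 2) = s^{-2} - 3s^{-3} + 7s^{-4} - \ldots$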

However, in many applications it is desired to identify a state 
space realization of a system, so that modern state space design and 
filtering techniques can be applied directly. It will be shown in
Chapter 3 that in applications with sufficient a priori information a
natural state space realization is available from physical considerations.
However in the present chapter we will assume that little or no a priori
information is available, except perhaps the McMillan degree (or order)
of the system (see Brockett (1970)).

A knowledge of the Markov parameters or the transfer function
of a system does not induce a unique state space realization. (Finding
a state space realization of a given input/output response is called
the realization problem, see Ho and Kalman (1970).) Indeed all minimal
realizations of a particular transfer function are related by a
similarity transformation, T, as follows:

Fact 2.1 (see Brockett (1970)).

If the triples $(A_i, B_i, C_i)$, $i = 1,2$, represent controllable and
observable systems (i.e. minimal) then

$$C_1(Is - A_1)^{-1}B_1 = C_2(Is - A_2)^{-1}B_2, \quad \forall s \in \mathbb{C},$$

if and only if there exists $T \in GL(n)$ such that

$$T A_1 T^{-1} = A_2, \qquad T B_1 = B_2, \qquad C_1 T^{-1} = C_2.$$
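
The "if" direction of Fact 2.1 can be checked numerically on a small case. The sketch below is only an illustration under assumed data: the matrices, the particular $T$ (chosen with determinant 1 so that its inverse has integer entries) and the helper names are hypothetical. It transforms a triple by a similarity transformation and confirms that the Markov parameters, and hence the transfer function, are unchanged although the triple itself is different.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def markov_parameters(A, B, C, count):
    """Return the list [C B, C A B, ..., C A^(count-1) B]."""
    params, AkB = [], B
    for _ in range(count):
        params.append(matmul(C, AkB))
        AkB = matmul(A, AkB)
    return params


# Hypothetical minimal system (A1, B1, C1).
A1 = [[0, 1], [-2, -3]]
B1 = [[0], [1]]
C1 = [[1, 0]]

# An assumed nonsingular T and its inverse (det T = 1).
T = [[1, 1], [0, 1]]
T_inv = [[1, -1], [0, 1]]

# Equivalent triple (T A1 T^-1, T B1, C1 T^-1).
A2 = matmul(matmul(T, A1), T_inv)
B2 = matmul(T, B1)
C2 = matmul(C1, T_inv)

same_response = markov_parameters(A1, B1, C1, 6) == markov_parameters(A2, B2, C2, 6)
print(A2, B2, C2, same_response)
```

Any other nonsingular $T$ would give yet another triple with the same response, which is the nonuniqueness discussed next.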


Therefore there are infinitely many equivalent realizations of 
a particular input/output response. Hence in any identification al- 
gorithm which would be minimizing some external cost function over the 
system parameters, if the complete A ,B and C matrices are left as free 
parameters there will not be a unique solution. This implies that these
minimization algorithms could become ill-posed and in any case they
cannot be expected to converge to the same solution for different
conditions. Therefore it is necessary to restrict the system matrices
to a subset so that there is a unique solution; for example the A, B and
C matrices could be parametrized in some way, where a parametrization is
defined as follows.

Definition 2.1: A parametrization of a topological space $S$ is a $C^1$
mapping from $\Omega \subset R^q$ into $S$.

Canonical forms are one possible solution to the uniqueness 
problem, and are now introduced. 

The relation E given by $(A_1, B_1, C_1)\, E\, (A_2, B_2, C_2)$ if there exists
$T \in GL(n)$ such that,

$$T A_1 T^{-1} = A_2, \qquad T B_1 = B_2, \qquad C_1 T^{-1} = C_2,$$

is an equivalence relation, which is identical to equivalence of the
transfer functions if the systems are minimal (Fact 2.1), but for
nonminimal systems equivalence of transfer functions does not necessarily
imply E-equivalence. The set of systems equivalent to a particular
system is termed an orbit in the parameter space (A,B,C). A canonical
form is then a subset of the parameter space which intersects each orbit
exactly once, or more precisely (see MacLane and Birkhoff (1967)):

Definition 2.2: Let E be an equivalence relation on the set S, then
a canonical form for S under E is a subset $C \subset S$ such that
1) $\forall s \in S$ there exists $c \in C$ such that $s\,E\,c$,
and 2) $c_1, c_2 \in C$, with $c_1\,E\,c_2 \Rightarrow c_1 = c_2$.
The choice of the set S for the present problem is of importance
and there are four natural candidates. Firstly S could be all matrix 
triples (A,B,C) which are both controllable and observable, but this 
set is very difficult to parametrize (see Chapter 5) . Secondly S could 
be the arbitrary matrices (A,B,C) , but although this is a well-behaved 
space it introduces unnecessary complications which are avoided if one 
chooses S to be the set of controllable (or observable) matrix triples 
(A,B ,C) . This latter set is in fact chosen since it includes all the 
minimal systems, in which we are primarily interested, and is technically 
most tractable. 

Canonical forms may therefore be particularly useful in iden- 
tification since they overcome the nonuniqueness problems of realization 
theory. To illustrate these points we now give some simple examples. 
Example 2.1

Consider the scalar system $(a, b, c)$, n = m = p = 1, with $b \neq 0$
(i.e. controllable). Then the orbits are given by $\tilde{b} = t b$,
$\tilde{c} = t^{-1} c$, $\tilde{a} = a$, with $t \neq 0$ (see Figure 2.1). A canonical
form for this case is $a \in R$, $c \in R$, $b = 1$ and corresponds to the
vertical line given in Figure 2.1. It will be noted that the canonical
form intersects each orbit once and no orbit twice, as required.

Note that the canonical form in Example 2.1 is a very simple
subset of the parameter space. That this is not always the case is
illustrated in the following example.

Example 2.2

Consider the system n = 1, m = p = 2, $(a, (b_1, b_2), (c_1, c_2)')$,
with $(b_1, b_2) \neq (0, 0)$. Now the orbits are given by

$$a = \text{constant}, \qquad (\tilde{b}_1, \tilde{b}_2) = t\,(b_1, b_2), \quad \text{where } t \in R \text{ and } t \neq 0.$$

In this case a canonical form is given by the set. 




The projections of the orbits into $(b_1, b_2)$ space are given in
Figure 2.2.

Note that this canonical form is not a connected subset of the
parameter space, although a connected subset which is a canonical form
is possible, for example the half circle given by,
Example 2.1 is a special case of the standard controllable form
for single input systems, i.e.

$$A = \begin{bmatrix} 0 & I_{n-1} \\ \multicolumn{2}{c}{a'} \end{bmatrix}, \qquad b = [0, 0, \ldots, 0, 1]'$$

with $a \in R^n$.

However, as illustrated by Example 2.2, such a nice canonical
form is not always possible. In general for multi-input/multi-output
systems there does not exist a single parametrization of the system
matrices that is also a canonical form under the similarity transformation.
Canonical forms that consist of a family of parametrizations have been
derived by many authors (e.g. Popov (1972), Mayne (1972), Luenberger
(1967)), with probably the nicest derivation given by Popov, which is
described in the next section. Related results for transfer functions are
in Rosenbrock (1970).

2.2 Popov's Canonical Form 

In this section we summarize the canonical form for
multivariable linear dynamical systems given implicitly in Popov (1972).
The canonical forms rely on finding a complete set of independent
invariants for the pair (A,B), assumed controllable, under the
transformation $(A,B) \to (TAT^{-1}, TB)$. An invariant is a property of a
system which does not change under the transformation. Completeness
of a set of invariants means that the set of invariants for any
particular system is sufficient to specify its orbit. Independence
means that for every set of values for the invariants there exists a
system with these invariants.

Consider the matrix,

$$W = [b_1, b_2, \ldots, b_m, Ab_1, Ab_2, \ldots, Ab_m, \ldots, A^{n-1}b_1, \ldots, A^{n-1}b_m]$$

(where $b_i$ = i-th column of B), which will have rank n by the controllability
assumption.

Definition 2.3: The i-th Kronecker invariant (or index), $n_i$, is the
smallest positive integer such that the vector $A^{n_i} b_i$ is a linear
combination of its antecedents (i.e. vectors $A^k b_j$ such that
$km + j < n_i m + i$).

It can be shown that the set of vectors,

$$P = [b_1, Ab_1, \ldots, A^{n_1-1}b_1, \ldots, b_m, Ab_m, \ldots, A^{n_m-1}b_m]$$

which are called regular by Popov, are independent and are in fact
the first set of n independent vectors that occur in the matrix W,
when moving from left to right. Clearly,

$$n_1 + n_2 + \ldots + n_m = n.$$

It is also shown that every non-regular vector is a linear
combination of its regular antecedents. The following theorem is the
main result in Popov (1972).
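
The left-to-right selection of regular vectors described above can be sketched in a few lines of Python. This is a minimal illustration, not Popov's own procedure: the function names and the test matrices are assumptions, exact rational arithmetic from the standard library is used to avoid rank decisions based on rounding, and a column $b_j$ that is itself dependent on its antecedents is simply given index 0.

```python
from fractions import Fraction


def matvec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def reduce_against(basis, v):
    """Eliminate from v its components along the stored basis vectors."""
    v = list(v)
    for pivot, b in basis:
        if v[pivot] != 0:
            f = v[pivot] / b[pivot]
            v = [x - f * y for x, y in zip(v, b)]
    return v


def kronecker_indices(A, B):
    """Scan b_1..b_m, A b_1..A b_m, ... and count the regular vectors per input."""
    n, m = len(A), len(B[0])
    A = [[Fraction(x) for x in row] for row in A]
    cols = [[Fraction(B[r][j]) for r in range(n)] for j in range(m)]  # A^k b_j
    basis = []                 # (pivot position, reduced vector) pairs
    indices = [0] * m
    active = [True] * m        # False once A^k b_j depends on its antecedents
    while any(active) and len(basis) < n:
        for j in range(m):
            if not active[j]:
                continue
            r = reduce_against(basis, cols[j])
            pivot = next((i for i, x in enumerate(r) if x != 0), None)
            if pivot is None:
                active[j] = False          # dependent: no further powers are regular
            else:
                basis.append((pivot, r))   # regular vector found
                indices[j] += 1
                cols[j] = matvec(A, cols[j])
    return indices


# Hypothetical controllable pair with n = 3, m = 2.
A = [[0, 0, 0], [1, 0, 0], [0, 0, 0]]
B = [[1, 0], [0, 0], [0, 1]]
indices = kronecker_indices(A, B)
print(indices, sum(indices))   # the indices sum to n for a controllable pair
```

Here the regular vectors are $b_1$, $b_2$ and $Ab_1$, so the indices are (2, 1).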

Theorem 2.2: A complete set of independent invariants for the pair
(A,B), under the transformation $(A,B) \to (TAT^{-1}, TB)$ where $T \in GL(n)$,
are the Kronecker invariants, $(n_1, n_2, \ldots, n_m)$, and the real numbers
$\alpha_{ijk}$ defined implicitly (but uniquely) as follows:

$$A^{n_i} b_i = \sum_{j=1}^{i-1} \; \sum_{k=0}^{\min(n_i,\, n_j - 1)} \alpha_{ijk} A^k b_j \; + \; \sum_{j=i}^{m} \; \sum_{k=0}^{\min(n_i,\, n_j) - 1} \alpha_{ijk} A^k b_j$$


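Now using the above set of invariants a canonical form can be
derived as follows.

Corollary 2.3: A canonical form for the controllable pair (A,B) under
the transformation $(A,B) \to (TAT^{-1}, TB)$ is given by the following
family of parametrizations, one for each set of indices $(n_1, n_2, \ldots, n_m)$
such that $n_1 + n_2 + \ldots + n_m = n$ and $n_i \geq 0$, $\forall i$.

$$A = \begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1m} \\ A_{21} & & & \vdots \\ \vdots & & \ddots & \vdots \\ \cdots & \cdots & \cdots & A_{mm} \end{bmatrix}, \qquad B = \begin{bmatrix} B_{11} & B_{12} & \cdots & B_{1m} \\ B_{21} & & & \vdots \\ \vdots & & \ddots & \vdots \\ \cdots & \cdots & \cdots & B_{mm} \end{bmatrix}$$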

where 
Notice that the canonical form consists of a finite number of
parametrizations, one for each set of Kronecker indices. If rank B = m
is assumed, there are in fact $\binom{n-1}{m-1}$ sets of possible Kronecker
indices, and so $\binom{n-1}{m-1}$ different parametrizations are required to
make up the canonical form. Clearly if rank B < m is possible a larger
number of parametrizations are required.
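
The count above can be checked by direct enumeration. The sketch below is an illustration only (the function name is hypothetical): under the assumption rank B = m every index satisfies $n_i \geq 1$, so the index sets are the ordered ways of writing n as a sum of m positive integers, and these are generated by choosing m - 1 cut points among the n - 1 gaps.

```python
from itertools import combinations
from math import comb


def kronecker_index_sets(n, m):
    """All (n_1, ..., n_m) with every n_i >= 1 and n_1 + ... + n_m = n."""
    sets = []
    for cuts in combinations(range(1, n), m - 1):
        bounds = (0,) + cuts + (n,)
        sets.append(tuple(b - a for a, b in zip(bounds, bounds[1:])))
    return sets


# n = 3 states, m = 2 inputs.
sets_3_2 = kronecker_index_sets(3, 2)
print(sets_3_2, len(sets_3_2) == comb(3 - 1, 2 - 1))
```

For n = 3 and m = 2 this gives the two index sets (1, 2) and (2, 1).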

The transformation that is of interest in identification is

$$(A,B,C) \to (TAT^{-1}, TB, CT^{-1}).$$

A canonical form under this transformation with (A,B) controllable
is given by A and B in the form of Corollary 2.3 with C completely free.
This is a canonical form because given any minimal triple (A,B,C) there
is a unique transformation ($= P^{-1}$, see Theorem 2.4) taking the pair
(A,B) to the canonical form, and hence the addition of the C matrix
does not alter this. Note that this canonical form represents
nonminimal input/output responses in a nonunique way, which is because
input/output equivalent nonminimal systems are not necessarily related
by a nonsingular matrix as above.

2.3 Other Canonical Forms

In this section a slightly different approach to canonical forms
is reported, essentially that given in Luenberger (1967); that
is, transformations are derived which bring arbitrary matrices (A,B) into
special forms.
It is shown in Luenberger (1967) that if the pair (A,B) is
controllable then there exists an ordered set of integers,
$K = (k_1, k_2, \ldots, k_m)$, such that,



on (A,B) that $\det P(A,B,K) \neq 0$, the following forms, $(\tilde{A}, \tilde{B})$, constitute
a canonical form for (A,B) under the transformation $(A,B) \to (TAT^{-1}, TB)$.

Further, the transformation, T, taking any (A,B) to this form is given
by $T = P^{-1}(A,B,K)$.

Proof: That the transformation $T = P^{-1}$ takes any (A,B), satisfying
$\det P(A,B,K) \neq 0$, to the above form is stated in Luenberger (1967)
and is easy to verify by straightforward manipulation. It is also
easy to demonstrate that for any pair (A,B) in the above form,
$P(A,B,K) = I_n$. Now suppose two systems in the above form are
equivalent, say, $\tilde{A}_1 = T \tilde{A}_2 T^{-1}$, $\tilde{B}_1 = T \tilde{B}_2$. Then,

$$I_n = P(\tilde{A}_1, \tilde{B}_1, K) = P(\tilde{A}_2, \tilde{B}_2, K)$$

$$I_n = P(\tilde{A}_1, \tilde{B}_1, K) = T \, P(\tilde{A}_2, \tilde{B}_2, K)$$

$$\therefore \; T = I_n$$

and $(\tilde{A}_1, \tilde{B}_1) = (\tilde{A}_2, \tilde{B}_2)$.

That is, no two equivalent systems have distinct representations.
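
The two facts used in this proof, that $T = P^{-1}(A,B,K)$ brings the pair to a form with $P = I_n$ and that $P(TAT^{-1}, TB, K) = T\,P(A,B,K)$, can be checked on a small example. The sketch below rests on an assumption: $P(A,B,K)$ is taken to be the matrix with columns $b_1, Ab_1, \ldots, A^{k_1-1}b_1, \ldots, b_m, \ldots, A^{k_m-1}b_m$, by analogy with the matrix of regular vectors in section 2.2. The numerical pair, the choice K = (2, 1) and all function names are hypothetical.

```python
from fractions import Fraction


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def inverse(M):
    """Gauss-Jordan inverse in exact rational arithmetic."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        p = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if p is None:
            raise ValueError("matrix is singular")
        aug[c], aug[p] = aug[p], aug[c]
        aug[c] = [x / aug[c][c] for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def P_matrix(A, B, K):
    """Assumed layout: columns b_j, A b_j, ..., A^(k_j - 1) b_j for j = 1..m."""
    n = len(A)
    columns = []
    for j, k_j in enumerate(K):
        v = [[B[r][j]] for r in range(n)]        # column vector b_j
        for _ in range(k_j):
            columns.append([row[0] for row in v])
            v = matmul(A, v)
    return [list(row) for row in zip(*columns)]  # stored columns -> list of rows


# Hypothetical pair (A, B) with n = 3, m = 2 and index set K = (2, 1).
A = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]
B = [[1, 0], [0, 1], [0, 0]]
K = (2, 1)

P = P_matrix(A, B, K)
T = inverse(P)                       # T = P^{-1}(A, B, K)
A_t = matmul(matmul(T, A), P)        # T A T^{-1}, using T^{-1} = P
B_t = matmul(T, B)
identity = [[int(i == j) for j in range(3)] for i in range(3)]
print(P_matrix(A_t, B_t, K) == identity)
```

The check succeeds for any pair with $\det P(A,B,K) \neq 0$, since $(TAT^{-1})^k\,Tb_j = T A^k b_j$.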
Notice that if we were not restricted to a particular set K
this would not be a canonical form, because a particular system may
have several sets $K_i$ for which $\det P(A,B,K_i) \neq 0$, which is why the more
complicated forms of Popov etc. are required.

Also observe that if the indices $k_i = n_i$, $i = 1, \ldots, m$, the
Kronecker indices, then the resulting form will be the same as that
in Corollary 2.3, since the appropriate entries will be zero.


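Now a different canonical form will be derived (also from
Luenberger (1967)) where the transformation is more complex, as follows.

Define: $\sigma_i = \sum_{j=1}^{i} k_j$,

$e_i$ = the $\sigma_i$-th row of $P^{-1}(A,B,K)$, $i = 1, \ldots, m$,

$$T = \begin{bmatrix} e_1 \\ e_1 A \\ \vdots \\ e_1 A^{k_1-1} \\ \vdots \\ e_m \\ e_m A \\ \vdots \\ e_m A^{k_m-1} \end{bmatrix}, \qquad \tilde{A} = T A T^{-1}, \qquad \tilde{B} = T B.$$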
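
The construction of $T$ from the rows $e_i$ can be sketched as follows. As before this is only an illustration under stated assumptions: $P(A,B,K)$ is taken to have columns $b_1, \ldots, A^{k_1-1}b_1, \ldots, b_m, \ldots, A^{k_m-1}b_m$, and the numerical pair, the index set K = (2, 1) and the function names are hypothetical. The example shows the shifted-identity structure in the rows of $\tilde{A}$ that are not $\sigma_i$-th rows, and the placement of the nonzero entries of $\tilde{B}$.

```python
from fractions import Fraction


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def inverse(M):
    """Gauss-Jordan inverse in exact rational arithmetic."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        p = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if p is None:
            raise ValueError("matrix is singular")
        aug[c], aug[p] = aug[p], aug[c]
        aug[c] = [x / aug[c][c] for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def P_matrix(A, B, K):
    """Assumed layout: columns b_j, A b_j, ..., A^(k_j - 1) b_j for j = 1..m."""
    n = len(A)
    columns = []
    for j, k_j in enumerate(K):
        v = [[B[r][j]] for r in range(n)]
        for _ in range(k_j):
            columns.append([row[0] for row in v])
            v = matmul(A, v)
    return [list(row) for row in zip(*columns)]


def luenberger_T(A, B, K):
    """Stack e_i, e_i A, ..., e_i A^(k_i - 1), with e_i the sigma_i-th row of P^-1."""
    P_inv = inverse(P_matrix(A, B, K))
    rows, sigma = [], 0
    for k_i in K:
        sigma += k_i
        row = [P_inv[sigma - 1]]          # e_i as a 1 x n matrix
        for _ in range(k_i):
            rows.append(row[0])
            row = matmul(row, A)          # e_i A^k -> e_i A^(k+1)
    return rows


# Hypothetical pair (A, B) with n = 3, m = 2 and index set K = (2, 1).
A = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]
B = [[1, 0], [0, 1], [0, 0]]
K = (2, 1)

T = luenberger_T(A, B, K)
A_t = matmul(matmul(T, A), inverse(T))
B_t = matmul(T, B)
print(A_t)
print(B_t)
```

With K = (2, 1) we have $\sigma_1 = 2$ and $\sigma_2 = 3$, so only the first row of $\tilde{A}$ is forced to be a shifted unit vector, and the nonzero entries of $\tilde{B}$ sit in rows 2 and 3.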
Theorem 2.5: Using the above definitions the forms of $\tilde{A}$ and $\tilde{B}$ are as
follows (assume $k_i > 0 \;\; \forall i$),



Further if $(k_1, \ldots, k_m)$ are chosen to be the Kronecker invariants,
then for $i \neq j$,

$$\tilde{B}_{ij} = \begin{cases} [0, \ldots, 0, \; e_i A^{k_i-1} b_j]' \quad (k_i \times 1) & \text{if } k_j < k_i \text{ and } j > i, \\ 0_{k_i,1} & \text{otherwise,} \end{cases}$$

and this latter form is canonical.
Proof:

1) Structure of A

We have that $T A = \tilde{A} T$, i.e.

$$\begin{bmatrix} e_1 \\ e_1 A \\ \vdots \\ e_m A^{k_m-1} \end{bmatrix} A = \tilde{A} \begin{bmatrix} e_1 \\ e_1 A \\ \vdots \\ e_m A^{k_m-1} \end{bmatrix}.$$

Since T is non-singular (see Luenberger (1967), Appendix II),
the j-th row of $\tilde{A}$ = [0 0 ... 0 1 0 ... 0], with the 1 in the (j + 1)-th
position (if $j \neq \sigma_i$ for any i).

Let the $\sigma_i$-th rows of $\tilde{A}$ = $[\gamma_{i10}, \ldots, \gamma_{i20}, \ldots, \gamma_{imk_m-1}]$.
Then $\{\gamma_{ijk}\}$ are specified uniquely by the equation,

$$(*) \qquad e_i A^{k_i} = \sum_{j=1}^{m} \sum_{k=0}^{k_j-1} \gamma_{ijk} \, e_j A^k, \qquad i = 1, 2, \ldots, m.$$

It just remains to show that,

$\gamma_{ijk} = 0$ for j such that $k_j > k_i$
and $k = k_i, k_i + 1, \ldots, k_j - 1$.

This is easily proved inductively using the following fact.

Fact: If for some s satisfying $k_j - 2 \geq s \geq k_i$,
$\gamma_{ijk} = 0 \;\; \forall j$ such that $k_j \geq s + 2$
and k such that $k_j - 1 \geq k \geq s + 1$,
then $\gamma_{ijk} = 0 \;\; \forall j$ such that $k_j \geq s + 1$,
and k such that $k_j - 1 \geq k \geq s$.

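Proof: Let $\ell \in \{1, \ldots, m\}$ be any integer such that $k_\ell \geq s + 1$. Take
the inner product of equation (*) with $A^{k_\ell - s - 1} b_\ell$ to give

$$(**) \qquad e_i A^{k_i + k_\ell - s - 1} b_\ell = \gamma_{i \ell s} \, e_\ell A^{k_\ell - 1} b_\ell + \sum_{k=0}^{s-1} \gamma_{i \ell k} \, e_\ell A^{k + k_\ell - s - 1} b_\ell + \sum_{\substack{j=1 \\ j \neq \ell}}^{m} \sum_{k=0}^{\min(s,\, k_j - 1)} \gamma_{ijk} \, e_j A^{k + k_\ell - s - 1} b_\ell$$

Now by definition of $e_i$ as the $\sigma_i$-th row of $P^{-1}$ we have

$$e_i A^k b_j = \begin{cases} 1 & \text{if } i = j, \; k = k_i - 1 \\ 0 & \text{if } i = j, \; k = 0, 1, \ldots, k_i - 2 \\ 0 & \text{if } i \neq j, \; k = 0, 1, \ldots, k_j - 1. \end{cases}$$

Therefore in equation (**), since $\ell \neq i$ ($k_\ell \geq s + 1 \geq k_i + 1$), we have
LHS = 0 since $k_i + k_\ell - s - 1 \leq k_\ell - 1$,
first summation = 0 since $k + k_\ell - s - 1 \leq k_\ell - 2$,
second summation = 0 since $k + k_\ell - s - 1 \leq k_\ell - 1$,
and $e_\ell A^{k_\ell - 1} b_\ell = 1$.

Therefore $\gamma_{i \ell s} = 0$ and this can be repeated for all j such that
$k_j \geq s + 1$, proving the result.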


2) Structure of B

We have $\tilde{B} = T B$, so that the entries of $\tilde{B}$ are the scalars
$e_i A^k b_j$ ($k = 0, \ldots, k_i - 1$), and using the values for $e_i A^k b_j$
given previously the result follows easily.

If $(k_1, k_2, \ldots, k_m)$ are the Kronecker indices and j is such
that $k_j < k_i$, then $A^k b_j$ is not a regular vector if $k = k_j, k_j + 1, \ldots, k_i - 1$,
and is therefore a linear combination of its regular antecedents (see
section 2.2 for definitions). Therefore $e_i A^k b_j = 0$ unless $A^{k_i - 1} b_i$
is an antecedent of $A^k b_j$, which occurs only if $k = k_i - 1$ and $j > i$,
thus proving the result.

3) Canonical Form 

That the above form is canonical follows from the observations 
that for any set of Kronecker invariants, (i) the above form has a 
representation of any (A,B) with these invariants, and (ii) this form 
and Popov's canonical form have the same number of real-valued free 
parameters. 

Note that unless all the Kronecker indices are equal there
will always be some zeros in the $\sigma_i$-th rows of the $\tilde{A}$ matrix. This
canonical form is preferable to that of Popov when the effect of
feedback is being studied, since feedback will only alter the $\sigma_i$-th
rows of $\tilde{A}$. However note that when the $k_i$ are not the Kronecker indices
the parametrization is not as simple as that of Theorem 2.4.

A canonical form for the controllable triple (A,B,C) under the
transformation $(A,B,C) \to (TAT^{-1}, TB, CT^{-1})$ is again given by $(\tilde{A}, \tilde{B})$
in the above form with C free.
2.4 Remarks on Using Canonical Forms for Identification

Two canonical forms for linear dynamical systems have been
derived in sections 2.2 and 2.3. Both of these have parametrized the
pair (A,B) under the assumption of controllability, and made the C
matrix arbitrary. Dual canonical forms can be found if the pair $(A', C')$
is parametrized in the same way assuming observability, with the B
matrix arbitrary.

For single input systems the canonical form given in section 2.3
is the well-known standard controllable form, and for single output
systems the standard observable form is given by Popov's canonical form
for the pair $(A', C')$. In both these cases a single canonical
parametrization is required, as given by the previous canonical forms.

The canonical forms include exactly one parametrization with
nm + np degrees of freedom, the so-called generic case, and this
parametrization can represent "almost all" systems of order n. (This
occurs when the first n columns of $[B, AB, \ldots, A^{n-1}B]$ are independent.)
The other parametrizations, which will have fewer degrees of freedom,
are then necessary to represent the boundary of the generic
parametrization, and since this boundary is geometrically unpleasant many extra
parametrizations may be required.

One may be tempted to suggest ignoring the lower order
parametrizations since they have measure zero in some sense, but this is
a fallacious argument for two reasons. Firstly it is analogous to
saying that almost all square matrices are invertible, so ignore singular
matrices, which is clearly a numerically ill-advised step. Secondly the
non-generic systems are bound to occur in some natural situations, as in
the following example of two systems connected in parallel.
Example 2.3: Suppose a system is composed of the parallel connection
of two subsystems with distinct inputs, i.e.

$$\dot{x}_1 = A_1 x_1 + B_1 u_1, \qquad \dot{x}_2 = A_2 x_2 + B_2 u_2,$$

$$y = y_1 + y_2 = C_1 x_1 + C_2 x_2,$$

where $x_i \in R^{n^{(i)}}$, $u_i \in R^{m^{(i)}}$, $i = 1, 2$.

Then the composite system has

$$A = \begin{bmatrix} A_1 & 0 \\ 0 & A_2 \end{bmatrix}, \qquad B = \begin{bmatrix} B_1 & 0 \\ 0 & B_2 \end{bmatrix},$$

and the controllability matrix,

$$W = \begin{bmatrix} B_1 & 0 & A_1 B_1 & 0 & \cdots & A_1^{n-1} B_1 & 0 \\ 0 & B_2 & 0 & A_2 B_2 & \cdots & 0 & A_2^{n-1} B_2 \end{bmatrix},$$

where $n = n^{(1)} + n^{(2)}$.

Now in many cases the first n columns of W are dependent,
regardless of the Kronecker indices of the subsystems,
e.g. for $n^{(1)} = 3$, $n^{(2)} = 1$, $m^{(1)} = 1$, $m^{(2)} = 1$.
Then the fourth column of W is dependent on the second, so the system
cannot be represented by the generic parametrization. ■
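
The dependence claimed in Example 2.3 can be checked by direct computation. The following is only an illustrative sketch: the companion-form entries of the first subsystem and the pole of the second are hypothetical numbers, and `matmul` and `rank` are small helpers written for this sketch (exact arithmetic is used so that no rank threshold is needed).

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def rank(M):
    """Exact rank by Gauss-Jordan elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

# Hypothetical data: subsystem 1 has order 3 and one input (companion form),
# subsystem 2 has order 1 and one input; there is no coupling between them.
A = [[0, 0, -6, 0],
     [1, 0, -11, 0],
     [0, 1, -6, 0],
     [0, 0, 0, -4]]
B = [[1, 0],
     [0, 0],
     [0, 0],
     [0, 1]]

# Build W = [B, AB, A^2 B, A^3 B] block by block.
blocks, X = [], B
for _ in range(4):
    blocks.append(X)
    X = matmul(A, X)
W = [sum((blk[i] for blk in blocks), []) for i in range(4)]

first_n = [row[:4] for row in W]      # columns b1, b2, A b1, A b2
rank_first_n = rank(first_n)
rank_W = rank(W)
print(rank_first_n, rank_W)
```

The composite pair is controllable (W has full rank) while its first n columns are not independent, which is the situation the generic parametrization cannot represent.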

In some sense such cases are not likely to occur because we
have assumed zero coupling between the states of the subsystems, which
will occur almost never if one takes the usual measure on the real line.
However such situations often occur in practical composite systems and
cannot be ignored.


A major disadvantage of using canonical forms for identification
is that the realization of systems close to the boundary of a
particular parametrization becomes numerically ill-posed, and this seems to be
inherent in canonical forms of this type. Some difficulties of this
type are illustrated by the following example.

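Example 2.4: Let n = 3, m = 2, p = 2. For the Popov canonical form,
there are two parametrizations for this case if B is assumed to have
rank 2.

1) $n_1 = 2$, $n_2 = 1$,

$$A_1 = \begin{bmatrix} 0 & \alpha_{110} & \alpha_{210} \\ 1 & \alpha_{111} & \alpha_{211} \\ 0 & \alpha_{120} & \alpha_{220} \end{bmatrix}, \qquad B_1 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix}$$

2) $n_1 = 1$, $n_2 = 2$,

$$A_2 = \begin{bmatrix} \alpha_{110} & 0 & \alpha_{210} \\ \alpha_{120} & 0 & \alpha_{220} \\ \alpha_{121} & 1 & \alpha_{221} \end{bmatrix}, \qquad B_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}$$

with C arbitrary in both cases.

Now suppose we wish to realize the transfer function,

$$G_\epsilon(s) = \begin{bmatrix} s^{-1} & 0 \\ \epsilon s^{-2} & s^{-2} \end{bmatrix}, \qquad \text{with } 0 < \epsilon \ll 1.$$

The realization in the canonical form will be,

$$A_1 = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & \epsilon^{-1} \\ 0 & 0 & 0 \end{bmatrix}, \qquad B_1 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix}, \qquad C_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \epsilon & 0 \end{bmatrix}$$

and as $\epsilon \to 0$ the $\epsilon^{-1}$ in $A_1$ tends to infinity, compensated by the $\epsilon \to 0$
in the C matrix. Hence very small errors in identifying the (2,2) element
of the matrix C will give large errors in the transfer function.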
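
The error amplification described above can be illustrated numerically. The sketch below is not part of the original development: it assumes the canonical realization $(A_1, B_1, C_1)$ of the example, and the values of `eps` and `delta` are hypothetical. It perturbs the (2,2) element of C by `delta` and compares the resulting error in the Markov parameter CAB with `delta / eps`.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def markov(A, B, C, k):
    """Return the Markov parameter C A^k B."""
    X = B
    for _ in range(k):
        X = matmul(A, X)
    return matmul(C, X)

eps, delta = 1e-3, 1e-4   # hypothetical: small parameter and small identification error

A1 = [[0.0, 0.0, 0.0],
      [1.0, 0.0, 1.0 / eps],
      [0.0, 0.0, 0.0]]
B1 = [[1.0, 0.0],
      [0.0, 0.0],
      [0.0, 1.0]]
C1 = [[1.0, 0.0, 0.0],
      [0.0, eps, 0.0]]
C1_perturbed = [[1.0, 0.0, 0.0],
                [0.0, eps + delta, 0.0]]   # error in the (2,2) element only

exact = markov(A1, B1, C1, 1)
perturbed = markov(A1, B1, C1_perturbed, 1)
error = perturbed[1][1] - exact[1][1]      # error in the (2,2) entry of CAB
print(error, delta / eps)
```

Under these assumptions the error in CAB is the error in C multiplied by $\epsilon^{-1}$, so it grows without bound as the system approaches the boundary of the parametrization.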

The above undesirable behaviour is not due to some pathological
property of the system, which remains of order 3 for all real-valued ε,
but is entirely due to the parametrizations we have chosen for the
system. Therefore if one wishes to use a canonical form, and the system
being identified is close to, but not on, the boundary of one
parametrization, then there are two possible courses of action. Firstly one
can assume the system is in fact on the boundary and use a parametrization
with fewer degrees of freedom, with the inherent loss in possible
accuracy. Secondly one could use the correct parametrization and endure
the numerical difficulties indicated above. The former approach is
probably preferable, bearing in mind that the data will in general be
imperfect. An approach similar to this has been suggested by Weinert and
Anton (1972) and Tse and Weinert (1973).

Notice that in order to determine the Kronecker indices of
(A,B), it is necessary to verify that certain vectors are dependent,
unless we have the generic case. Such a test is very difficult in a
statistical setting because all determinants will in general be non-zero,
and some threshold will have to be established. However the independence
of a set of vectors is a relatively easy and well-posed statistical
problem, as will be illustrated in the next section. Therefore it is
relatively easy to find a set of integers $K = (k_1, k_2, \ldots, k_m)$ summing
to n such that the matrix P(A,B,K) is clearly non-singular, where

$$P(A,B,K) = [b_1, Ab_1, \ldots, A^{k_1-1}b_1, \ldots, b_m, Ab_m, \ldots, A^{k_m-1}b_m].$$


This latter observation is the basis for an alternate approach
to the parametrization problem, which avoids all the difficulties
mentioned above, and is as follows. Once one has selected a set of integers
$K = (k_1, k_2, \ldots, k_m)$ such that the columns of P(A,B,K) are clearly
independent, then one can use the canonical form for this set K given
in Theorem 2.4. This parametrization will then be well-posed for all
systems (A,B) such that the columns of P(A,B,K) are reasonably
independent (i.e. all systems in a large neighborhood of the nominal
values for the system). A particular system could have a finite number
of different realizations using this method if several different sets $K_i$
make $\det P(A,B,K_i) \neq 0$. However this should not be a practical problem
since canonical forms will be used because there is no obvious physical
interpretation for the states, so that the particular realization is
not important. Also if one were comparing the identified parameters of
a system in two situations one can artificially assign the indices for
the second situation to be the same as those in the first, so that the
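parameter values can be compared directly. Example 2.4, given earlier
in this section, would be realized as follows.

Example 2.4 (continued)

$$G_\epsilon(s) = \begin{bmatrix} s^{-1} & 0 \\ \epsilon s^{-2} & s^{-2} \end{bmatrix}, \qquad \begin{bmatrix} CB & CAB \\ CAB & CA^2B \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & \epsilon & 1 \\ 0 & 0 & 0 & 0 \\ \epsilon & 1 & 0 & 0 \end{bmatrix}$$

Clearly the columns 1, 2 and 4 of the Hankel matrix are independent,
so we will set $k_1 = 1$, $k_2 = 2$. Therefore the parametrization
will be,

$$A = \begin{bmatrix} x & 0 & x \\ x & 0 & x \\ x & 1 & x \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad C = \begin{bmatrix} x & x & x \\ x & x & x \end{bmatrix}$$

where x denotes a free parameter.

In fact $G_\epsilon(s)$ is realized by

$$A = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ \epsilon & 1 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad C = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

which is clearly well-posed for all ε. ■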
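
As a check on the realization with K = (1, 2), the following sketch compares its Markov parameters with those of $G_\epsilon(s)$, namely $CB = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$, $CAB = \begin{bmatrix} 0 & 0 \\ \epsilon & 1 \end{bmatrix}$ and $CA^kB = 0$ for $k \geq 2$. The helper names and the tested values of ε are assumptions made for illustration.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def markov(A, B, C, k):
    """Return the Markov parameter C A^k B."""
    X = B
    for _ in range(k):
        X = matmul(A, X)
    return matmul(C, X)

def realization(eps):
    """Realization of G_eps(s) in the parametrization with K = (1, 2)."""
    A = [[0, 0, 0], [0, 0, 0], [eps, 1, 0]]
    B = [[1, 0], [0, 1], [0, 0]]
    C = [[1, 0, 0], [0, 0, 1]]
    return A, B, C

def target(eps, k):
    """Markov parameters read off from G_eps(s)."""
    if k == 0:
        return [[1, 0], [0, 0]]
    if k == 1:
        return [[0, 0], [eps, 1]]
    return [[0, 0], [0, 0]]

# Hypothetical test values, including the boundary case eps = 0.
test_values = (Fraction(0), Fraction(1, 1000), Fraction(1))
all_match = all(markov(*realization(e), k) == target(e, k)
                for e in test_values for k in range(5))
print(all_match)
```

Every entry of this realization stays bounded as ε tends to zero, in contrast to the $\epsilon^{-1}$ entry of the canonical form.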

The family of parametrizations suggested here is thus a series
of parametrizations, with each one representing "almost all" systems,
and a particular one is chosen by finding the most appropriate set of
integers $K = (k_1, \ldots, k_m)$ such that $\det P(A,B,K) \neq 0$.

The assumption that K is known is only a slightly greater
assumption than knowing the order, in that in order to determine the
order of a system one has to essentially find a set K.

2.5 A New Realization Algorithm

Here we use the ideas presented in the previous two sections
to derive a realization algorithm (i.e. to find (A,B,C) satisfying
$CA^kB = H_k$ for some given $H_k$, k = 0, 1, 2, ...). It has the pleasing
property that it produces A and B matrices in the special form given
in Theorem 2.4 and uses no more computational effort than other methods
(e.g. Ho and Kalman (1967)). We first give a preliminary lemma proved
in Luenberger (1967).

Lemma 2.6: If (A,B) is controllable the following algorithm generates
n independent vectors, P(A,B,S(n)).

Let $S(r) = (s_1(r), s_2(r), \ldots, s_m(r))$
where $s_j(r) \geq 0$ for all j = 1, 2, ..., m
and $\sum_{j=1}^{m} s_j(r) \leq n$.

Let P(A,B,S(r)) be as defined in Section 2.3.

Algorithm

1) Set r = 0, $s_j(0) = 0$, j = 1, ..., m.

2) Pick any $j \in \{1, 2, \ldots, m\}$, say $\hat{j}$, such that
$A^{s_{\hat{j}}(r)} b_{\hat{j}}$ is independent of P(A,B,S(r)).
Set

$$s_j(r+1) = \begin{cases} s_j(r) + 1 & \text{if } j = \hat{j} \\ s_j(r) & \text{if } j \neq \hat{j} \end{cases}$$

3) Increase the index r by one.

4) If r < n then return to 2), otherwise stop.

The above lemma produces a basis for the matrix
$W = [B, AB, \ldots, A^{n-1}B]$ of a particular type (i.e. if $A^k b_j$ is in the
basis then so are $A^\ell b_j$ for $\ell < k$), and the integers $s_j(n)$ are such
that $A^{s_j(n)-1} b_j$ is in the basis and $A^{s_j(n)} b_j$ is not in the basis.
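
The selection rule of Lemma 2.6 can be sketched as follows. This is only an illustration under stated assumptions: exact rational arithmetic replaces any statistical independence test, "pick any j" is implemented as "pick the first admissible j", and the function names are hypothetical. The test system is the realization of Example 2.4 with K = (1, 2).

```python
from fractions import Fraction

def rank(M):
    """Exact rank by Gauss-Jordan elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def select_indices(A, B):
    """Return S(n) and the n selected vectors A^k b_j (stored as rows)."""
    n, m = len(A), len(B[0])
    s = [0] * m
    candidates = [[row[j] for row in B] for j in range(m)]  # A^{s_j} b_j
    basis = []
    for _ in range(n):
        for j in range(m):
            trial = basis + [candidates[j]]
            if rank(trial) == len(trial):       # candidate is independent
                basis = trial
                s[j] += 1
                candidates[j] = matvec(A, candidates[j])
                break
        else:
            raise ValueError("(A, B) is not controllable")
    return s, basis

B = [[1, 0], [0, 1], [0, 0]]
s_generic, _ = select_indices([[0, 0, 0], [0, 0, 0], [Fraction(1, 100), 1, 0]], B)
s_boundary, _ = select_indices([[0, 0, 0], [0, 0, 0], [0, 1, 0]], B)
print(s_generic, s_boundary)
```

With exact arithmetic the selected indices change abruptly between ε = 1/100 and ε = 0, which is the dependence test that becomes difficult once the data are noisy.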

The following realization algorithm produces such a basis for
the columns of the Hankel matrix (see Brockett (1970)) in the same way
as Lemma 2.6.

Theorem 2.7: Given that the rank of the infinite Hankel matrix

$$\begin{bmatrix} H_0 & H_1 & H_2 & \cdots \\ H_1 & H_2 & H_3 & \cdots \\ H_2 & H_3 & H_4 & \cdots \\ \vdots & \vdots & \vdots & \end{bmatrix}$$

is less than or equal to N, the triple (A,B,C) as given by the following
algorithm is a realization of $\{H_k\}$, i.e. $CA^kB = H_k$ for k = 0, 1, 2, ...

Firstly define $h_\ell$ as the $\ell$th column of the finite Hankel matrix $H_N$.

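Algorithm

Step 1: Initialization

r = 1,
$s_j(1) = 0$, j = 1, 2, ..., m,
$f_j(1) = h_j$, j = 1, 2, ..., m.

Step 2:

a) Choose $\hat\ell$ such that $\hat\ell = m s_{\hat j}(r) + \hat j$ for some $\hat j \in \{1, 2, \ldots, m\}$ and

$$\|f_{\hat\ell}(r)\|^2 \geq \|f_\ell(r)\|^2, \quad \forall\, \ell > \hat\ell \text{ and } \ell \in \{m s_j(r) + j\}_{j=1}^{m},$$

$$\|f_{\hat\ell}(r)\|^2 > \|f_\ell(r)\|^2, \quad \forall\, \ell < \hat\ell \text{ and } \ell \in \{m s_j(r) + j\}_{j=1}^{m}.$$

b) Set $e_{\hat\ell} = f_{\hat\ell}(r)$, $\ell_{r+1} = \hat\ell$.

c) For $\ell = m s_j(r) + j$, $j = 1, 2, \ldots, \hat j - 1, \hat j + 1, \ldots, m$, set

$$\gamma_{\ell,\hat\ell} = \frac{\langle f_\ell(r), e_{\hat\ell} \rangle}{\langle e_{\hat\ell}, e_{\hat\ell} \rangle}, \qquad f_\ell(r+1) = f_\ell(r) - \gamma_{\ell,\hat\ell}\, e_{\hat\ell}.$$

d) For q = 1, ..., r + 1 set

$$\gamma_{\hat\ell+m,\ell_q} = \frac{\langle h_{\hat\ell+m}, e_{\ell_q} \rangle}{\langle e_{\ell_q}, e_{\ell_q} \rangle}$$

and set

$$f_{\hat\ell+m}(r+1) = h_{\hat\ell+m} - \sum_{q=1}^{r+1} \gamma_{\hat\ell+m,\ell_q}\, e_{\ell_q}.$$

e) Set

$$s_j(r+1) = \begin{cases} s_j(r) + 1 & j = \hat j \\ s_j(r) & j \neq \hat j \end{cases}$$

Step 3: Increase the index r by one.

Step 4: If $\|f_\ell(r)\|^2 = 0$ for all $\ell = m s_j(r) + j$, j = 1, 2, ..., m,
go to Step 5, otherwise return to Step 2.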

Step 5: Set n = r - 1, k, ~ s.(r) -1 j = l r ., f ,m. 

3 3 


and 


then
set (A,B) in the form of Theorem 2.4 given the above $\{k_j\}$ and $\{\alpha_{ijk}\}$.


Step 6: Let C=[c n c„...c3 c.e 

1 2 n i 



Remarks:

1) The above algorithm produces a sequence of independent columns of
the Hankel matrix $H_N$, $h_{\ell_q}$, q = 1, ..., r + 1, which will be of the form
$MA^kb_j$ (where M denotes the matrix with block rows $C, CA, CA^2, \ldots$), selected
according to the rule given in Lemma 2.6. Since the realization will
have dimension n (which equals the rank of $H_N$), the rank of M is n.
Therefore any linear dependence or independence of the columns of
$H_N$ will be exactly the same as for the columns of $W = [B, AB, \ldots, A^{N-1}B]$.

The set of independent columns of $H_N$ is represented as an
orthogonal set by a Gram-Schmidt type procedure, and the vectors
eligible to join this basis at the $r$th step (i.e. $MA^{s_j(r)}b_j$,
j = 1, ..., m) are represented in terms of this basis and a component
orthogonal to it, and hence if the orthogonal component is non-zero it
is independent. The orthogonalization procedure also produces the $\gamma_{ij}$
which give the dependence of the vectors $MA^{s_j(r)}b_j$ on the basis;
the $\alpha_{ijk}$ required in the A (and perhaps B) matrices can then be found
by inverting an upper triangular matrix, which is computationally
very easy.

Once the basis is found the C matrix follows immediately since
$P(A,B,K) = I_n$. A formal proof of the algorithm is not given but it
is clear from an understanding of Theorem 2.4.

2) The rule for selecting the new vector to enter the basis is to 
take that vector, of those eligible, with the greatest component 
orthogonal to the basis. This rule is chosen because it ensures that 
the basis has a determinant far from zero. Further if the data is 
noisy the chosen basis will remain independent for comparatively large 
variations in the parameters.

3) In the stochastic case such a method would be well-suited for finding
a parametrization and approximate values for the parameters, to be used
subsequently in a more efficient identification method (e.g. maximum
likelihood). The selection procedure for the basis works best if the
inputs are of similar magnitude and the outputs are observed with similar
accuracy. Ideally one might want to choose a basis, $K_i$, with the least
probability of becoming ill-posed. For example if $\eta(K_i)$ is the largest
probability such that in the $\eta(K_i)$ confidence region $\det P(A,B,K_i) \neq 0$,
then one might choose the set with the greatest $\eta(K_i)$.
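
The selection rule of Remark 2 (admit the eligible Hankel column with the largest component orthogonal to the current basis) can be sketched as below. This is a simplified illustration rather than the algorithm of Theorem 2.7 itself: it only selects the indices K and the basis columns and does not construct (A,B,C), it recomputes each orthogonal component from scratch, and the names, the number of Hankel blocks and the tolerance `tol` are assumptions. The data are the Markov parameters of Example 2.4 with a hypothetical ε = 0.01.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def hankel_columns(markov, blocks):
    """Columns of the block Hankel matrix [H_{i+j}], i, j = 0, ..., blocks-1."""
    p, m = len(markov[0]), len(markov[0][0])
    cols = []
    for j in range(blocks):
        for c in range(m):
            cols.append([markov[i + j][row][c]
                         for i in range(blocks) for row in range(p)])
    return cols

def select_basis(markov, blocks, tol=1e-9):
    """Greedy choice of Hankel columns; returns (indices s_j, chosen columns)."""
    m = len(markov[0][0])
    cols = hankel_columns(markov, blocks)
    s = [0] * m
    basis, chosen = [], []
    while True:
        best = None
        for j in range(m):
            idx = m * s[j] + j              # next eligible column for input j
            if idx >= len(cols):
                continue
            f = list(cols[idx])
            for e in basis:                 # remove components along the basis
                g = dot(f, e) / dot(e, e)
                f = [a - g * b for a, b in zip(f, e)]
            size = dot(f, f)                # squared norm of orthogonal part
            if size > tol and (best is None or size > best[0]):
                best = (size, j, idx, f)
        if best is None:
            return s, chosen
        _, j, idx, f = best
        basis.append(f)
        chosen.append(idx)
        s[j] += 1

eps = 0.01
zero = [[0.0, 0.0], [0.0, 0.0]]
markov = [[[1.0, 0.0], [0.0, 0.0]],      # CB
          [[0.0, 0.0], [eps, 1.0]],      # CAB
          zero, zero, zero]              # CA^k B = 0 for k >= 2
s, chosen = select_basis(markov, blocks=3)
print(s, chosen)
```

Column indices in `chosen` are counted from zero, so the selection corresponds to columns 1, 2 and 4 of the Hankel matrix and to K = (1, 2), the choice made in Example 2.4 (continued).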



CHAPTER 3 


PARAMETER IDENTIFIABILITY FROM INPUT/OUTPUT OBSERVATIONS 


3.1 Introduction 

In this chapter we consider the identification of systems
described by linear differential or difference equations;

$$\frac{dx(t)}{dt} = Ax(t) + Bu(t), \qquad y(t) = Cx(t) + Du(t)$$

or

$$x(k+1) = Ax(k) + Bu(k), \qquad y(k) = Cx(k) + Du(k)$$

where $x \in R^n$, $u \in R^m$, $y \in R^p$, $A \in R^{n \times n}$, $B \in R^{n \times m}$, $C \in R^{p \times n}$, $D \in R^{p \times m}$.
Also define N = n(n + m + p) + mp, the total number of elements of the
matrices.


The problem is to identify these system matrices from input/output
observations. As explained in Chapter 2 there is no unique solution to
this identification problem because there are infinitely many equivalent
realizations of a particular input/output response. In Chapter 2 it
was shown how canonical forms can be used to overcome the non-uniqueness
problem. In the present chapter it is assumed that the system equations
are derived from physical knowledge of the system. That is, the elements
of the A, B, C, and D matrices are either,

1) zero,

2) known physical constants,

or 3) known functions of some unknown parameters.

Thus if the unknown parameters are denoted $\alpha \in \Omega \subset R^q$, then the matrices
may be written as A(α), B(α), C(α), and D(α) where $A: \Omega \to R^{n \times n}$;
$B: \Omega \to R^{n \times m}$; $C: \Omega \to R^{p \times n}$; and $D: \Omega \to R^{p \times m}$. That is, the system matrices
are parametrized by the unknown parameters, α.

In practice it is very often the case that such equations can
be postulated with relatively few unknown parameters and this is a
very useful way of incorporating one's a priori knowledge (e.g. in
aerospace problems). The identification problem is then to find
estimates of the unknown parameters based on the observed data.

When such a model can be formulated it has two main advantages
over using canonical forms (as given in Chapter 2). Firstly the
parameters being identified have a physical interpretation and secondly,
for multiple input/multiple output systems, the canonical forms have
the disadvantage that a set of integers (e.g. the Kronecker invariants)
must be determined before the real-valued parameters can be identified.

A natural question that arises in the context of such
identification problems is whether or not the unknown parameters, α, can be
identified from observations of the system. This is the so-called
identifiability problem and will be the subject of this chapter. Firstly
we will give some simple examples to illustrate the main concepts.

Example 3.1

Consider the two parametrizations of a single input/single output,
second order system.

1)
$$A(\alpha) = \begin{bmatrix} -1 & \alpha_1 \\ 0 & -2 \end{bmatrix}, \qquad B(\alpha) = \begin{bmatrix} 0 \\ \alpha_2 \end{bmatrix}, \qquad C(\alpha) = [1 \;\; 0]$$

This will have transfer function, $G(s) = \dfrac{\alpha_1 \alpha_2}{(s+1)(s+2)}$.

Here only the product $\alpha_1\alpha_2$ can be identified and neither $\alpha_1$
nor $\alpha_2$ can be identified individually. (This system would thus be
said to be not locally or globally identifiable.)

2)
$$A(\alpha) = \begin{bmatrix} \alpha_1 & 1 \\ 0 & \alpha_2 \end{bmatrix}, \qquad B(\alpha) = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad C(\alpha) = [1 \;\; 0]$$

This will have transfer function, $G(s) = \dfrac{1}{(s-\alpha_1)(s-\alpha_2)}$.

In this case if $\alpha_1 \neq \alpha_2$ then $\alpha_1$ and $\alpha_2$ can be uniquely identified
in a neighborhood of their nominal values. However if $\alpha_1$ and $\alpha_2$ have
their values exchanged then the transfer function is not altered, and
therefore the parametrization would be called not globally identifiable.
If however $\alpha_1$ and $\alpha_2$ are restricted to be in the set

$$\{(\alpha_1, \alpha_2) \in R^2 \mid \alpha_1 > \alpha_2\},$$

then $\alpha_1$ and $\alpha_2$ are globally identifiable. ■
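
The two situations in Example 3.1 can be illustrated by comparing Markov parameters $CA^kB$, k = 0, ..., 3, which determine the transfer function of a second order system. The sketch below uses hypothetical parameter values and helper names chosen for this illustration.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def markov_params(A, B, C, count):
    """Scalar Markov parameters C A^k B, k = 0, ..., count-1 (SISO case)."""
    out, X = [], B
    for _ in range(count):
        out.append(matmul(C, X)[0][0])
        X = matmul(A, X)
    return out

def first(a1, a2):
    """Parametrization 1) of Example 3.1."""
    return [[-1, a1], [0, -2]], [[0], [a2]], [[1, 0]]

def second(a1, a2):
    """Parametrization 2) of Example 3.1."""
    return [[a1, 1], [0, a2]], [[0], [1]], [[1, 0]]

# 1) Different parameters with the same product give the same response.
same_product = markov_params(*first(2, 3), 4) == markov_params(*first(1, 6), 4)

# 2) Exchanging the two parameters does not alter the response ...
swapped = markov_params(*second(2, 5), 4) == markov_params(*second(5, 2), 4)

# ... but a nearby change of one parameter does.
nearby_differs = markov_params(*second(2, 5), 4) != markov_params(*second(2, 6), 4)

print(same_product, swapped, nearby_differs)
```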


Example 3.2

Consider the system with n = m = p = 1 and a, b, c arbitrary
real numbers, then the transfer function is $G(s) = \dfrac{cb}{s-a}$ and clearly (a,b,c)
is not identifiable. However if $cb \neq 0$ then a can be identified
independently from b and c. This parametrization will be called partially
identifiable in a, independent of b and c. ■

The above examples illustrate that the identifiability of a
particular parametrization is not obvious and has several aspects to it.
In the following section the solutions to most of these identifiability
questions will be given.

3.2 Local Identifiability

Identifiability of parameters means roughly that parameter
estimates can be determined which are asymptotically exact. Identifiability
will thus depend on the data available, and in this chapter we will
assume that we could obtain asymptotically consistent estimates of the
transfer function and nothing else, which is assured by the following
assumptions:

A1) Both the input and output are observed, perhaps with observation
noise.

A2) The input is independent of the observations and is persistently
exciting (that is, the input excites all the system modes, see
Astrom and Bohlin (1966)).

A3) The observation noise statistics are such that the system transfer
function, or Markov parameters, can be identified asymptotically.

A4) The system was either started an arbitrarily long time before
identification was started, or the initial condition was zero.

The assumptions imply that if noise is present on all the ob- 
servations then the system must be stable , since otherwise A2 implies 
that some outputs would tend to infinity with increasing time. The 
correct way to identify unstable systems in the presence of noise is 
to insert a known stabilizing feedback system and identify the resulting 
composite system, from which the open-loop system could be deduced. 

Assumption A4 is included so that no more than the transfer
function can be identified. For reachable systems there is no
difference between the cases with the initial condition zero and non-zero
but unknown, since in the latter case the initial condition can be
replaced by an equivalent input. However for unreachable systems the
initial condition response gives additional information that is not
available from an input/output test, but such information will only be
finite if the observation noise covariance is positive definite.

Under the above assumptions the following definition for local
identifiability of a system parametrization, as given in Section 3.1,
is natural if one has nominal values for the unknown parameters (e.g.
wind tunnel tests on an airframe).

Definition 3.1

Let $(A,B,C,D)(\alpha): \Omega \subset R^q \to R^N$ (N = n(n+m+p) + mp), be a
parametrization of the system matrices (A,B,C,D) of a linear dynamical
system. This parametrization is said to be locally identifiable (from
the transfer function) at $\alpha = \hat\alpha \in \Omega$ if there exists an $\epsilon > 0$ such that

(i) $\|\alpha - \hat\alpha\| < \epsilon$, $\|\beta - \hat\alpha\| < \epsilon$, $\alpha, \beta \in \Omega$,

and (ii) $C(\alpha)A^k(\alpha)B(\alpha) = C(\beta)A^k(\beta)B(\beta)$, k = 0, 1, 2, ...

imply $\alpha = \beta$. ■

In other words, in a neighborhood of $\hat\alpha$, there are no two systems
with distinct parameters which have the same transfer function. This
definition is similar to the definition of "non-degeneracy" as given by
Kalman (1966). Definition 3.1 is equivalent to requiring that the map
from the parameters, α, into the Markov parameters is locally one-to-one.

A standard result on injective maps is given by the following lemma.

Lemma 3.1 (Rank Theorem, see for example Narasimhan (1968) page 18).

Let Ω be an open set in $R^n$ and $f: \Omega \to R^m$ be a $C^k$ map. Suppose
that rank $\dfrac{\partial f(x)}{\partial x} = r$ for all x in Ω. Then there exist open neighborhoods
U of a and V of b = f(a), cubes Q, Q' in $R^n$, $R^m$ respectively and $C^k$
diffeomorphisms $u: Q \to U$ and $u': V \to Q'$, such that if $\phi = u' \circ f \circ u$
then $\phi$ has the form

$$\phi(x_1, x_2, \ldots, x_n) = (x_1, \ldots, x_r, 0, 0, \ldots, 0).$$

(Note: a cube in $R^n$ is a set of the form $\{x \mid |x_j - a_j| < r_j\}$, and $\circ$
denotes composition.) ■

An immediate consequence of the Rank Theorem is:

Corollary 3.2

Let Ω be an open set in $R^n$ and $f: \Omega \to R^m$ be a $C^k$ map with
$k \geq 1$. Then if $\dfrac{\partial f(x)}{\partial x}$ has constant rank r in a neighborhood of $\hat{x}$,
f is locally injective if and only if r = n. ■


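We can now obtain an identifiability condition as follows.

Theorem 3.3

Let $(A,B,C,D)(\alpha): \Omega \subset R^q \to R^N$ (with Ω an open set in $R^q$)
be a $C^1$ parametrization of the system matrices (A,B,C,D). Then if
rank $\dfrac{\partial G(\alpha)}{\partial \alpha} = r$ (see below) for all α in some neighborhood of $\hat\alpha$, then the
parametrization is locally identifiable at $\hat\alpha$ if and only if r = q.
Here $G: \Omega \to R^{(2n+1)mp}$ is given by

$$G'(\alpha) = \left[\bar{D}'(\alpha),\; \left(\overline{C(\alpha)B(\alpha)}\right)',\; \left(\overline{C(\alpha)A(\alpha)B(\alpha)}\right)',\; \ldots,\; \left(\overline{C(\alpha)A^{2n-1}(\alpha)B(\alpha)}\right)'\right].$$

Further the Jacobian of G can be written as

$$\frac{\partial G(\alpha)}{\partial \alpha} = \begin{bmatrix} 0 & 0 & 0 & I_p \otimes I_m \\ 0 & C \otimes I_m & I_p \otimes B' & 0 \\ C \otimes B' & CA \otimes I_m & I_p \otimes B'A' & 0 \\ \vdots & \vdots & \vdots & \vdots \\ \sum_{r=1}^{k} CA^{k-r} \otimes B'A'^{\,r-1} & CA^{k} \otimes I_m & I_p \otimes B'A'^{\,k} & 0 \\ \vdots & \vdots & \vdots & \vdots \\ \sum_{r=1}^{2n-1} CA^{2n-1-r} \otimes B'A'^{\,r-1} & CA^{2n-1} \otimes I_m & I_p \otimes B'A'^{\,2n-1} & 0 \end{bmatrix} M(\alpha)$$

where the dependence of (A,B,C,D) on α is understood, and

$$M(\alpha) = \begin{bmatrix} \partial \bar{A}(\alpha)/\partial \alpha \\ \partial \bar{B}(\alpha)/\partial \alpha \\ \partial \bar{C}(\alpha)/\partial \alpha \\ \partial \bar{D}(\alpha)/\partial \alpha \end{bmatrix}$$

where if X is an n×m matrix given by $X' = [x_1, x_2, \ldots, x_n]$, with $x_i \in R^m$,
then $\bar{X}$ is the nm×1 vector given by $\bar{X}' = [x_1', x_2', \ldots, x_n']$. Also $\otimes$
denotes Kronecker product (see Appendix I). ■
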

Proof: If a system has order less than or equal to n, the set
$(D, CB, CAB, \ldots, CA^{2n-1}B)$ is sufficient to determine all subsequent Markov
parameters. Thus G is locally injective if and only if the function
from α into all the Markov parameters is locally injective. Therefore
the result follows immediately from Corollary 3.2, and it only remains
to show that the Jacobian of G is as given above.

$$\lim_{h \to 0} \frac{1}{h}\left[(C + h\Delta C)(A + h\Delta A)^k (B + h\Delta B) - CA^kB\right]$$

$$= \lim_{h \to 0} \frac{1}{h}\left[h\,\Delta C A^k B + h\,CA^k \Delta B + h \sum_{r=1}^{k} CA^{k-r} \Delta A\, A^{r-1} B + O(h^2)\right]$$

$$= \Delta C A^k B + CA^k \Delta B + \sum_{r=1}^{k} CA^{k-r} \Delta A\, A^{r-1} B$$

and the expression given for $\dfrac{\partial G}{\partial \alpha}$ is obtained by ordering the elements
of ΔA, ΔB, ΔC, and ΔD as given for M(α). ■
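
The first-order expansion used in the proof can be checked numerically against a finite difference. The following sketch is illustrative only: the matrices, the perturbation directions ΔA, ΔB, ΔC, the power k and the step h are hypothetical values, and the helper names are chosen for this illustration.

```python
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def add(*Ms):
    """Entrywise sum of matrices of equal size."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*Ms)]

def scale(c, X):
    return [[c * v for v in row] for row in X]

def power(A, k):
    P = [[float(i == j) for j in range(len(A))] for i in range(len(A))]
    for _ in range(k):
        P = matmul(P, A)
    return P

def markov(A, B, C, k):
    return matmul(matmul(C, power(A, k)), B)

def directional_derivative(A, B, C, dA, dB, dC, k):
    """dC A^k B + C A^k dB + sum_{r=1}^{k} C A^{k-r} dA A^{r-1} B."""
    terms = [matmul(matmul(dC, power(A, k)), B),
             matmul(matmul(C, power(A, k)), dB)]
    for r in range(1, k + 1):
        left = matmul(C, power(A, k - r))
        right = matmul(power(A, r - 1), B)
        terms.append(matmul(matmul(left, dA), right))
    return add(*terms)

def finite_difference(A, B, C, dA, dB, dC, k, h=1e-6):
    moved = markov(add(A, scale(h, dA)), add(B, scale(h, dB)),
                   add(C, scale(h, dC)), k)
    base = markov(A, B, C, k)
    return [[(x - y) / h for x, y in zip(r1, r2)] for r1, r2 in zip(moved, base)]

A = [[0.5, 1.0], [0.0, -0.3]]
B = [[1.0], [2.0]]
C = [[1.0, -1.0]]
dA = [[1.0, 0.0], [2.0, -1.0]]
dB = [[0.5], [1.0]]
dC = [[0.0, 1.0]]

exact = directional_derivative(A, B, C, dA, dB, dC, k=3)
approx = finite_difference(A, B, C, dA, dB, dC, k=3)
gap = max(abs(x - y) for r1, r2 in zip(exact, approx) for x, y in zip(r1, r2))
print(gap)
```

The remaining gap is of the order of the step h, as expected from the $O(h^2)$ term in the expansion.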


The expression above reduces to the evaluation of a q x q 
determinant. It is however unnecessarily complex if we know that a 
system is of minimal order, in which case we know that all equivalent 
systems are related by a similarity transformation as explained in 
Chapter 2. 

The following theorem gives conditions for the local identi- 
fiability of minimal systems. 

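Theorem 3.4

Let $(A,B,C,D)(\alpha): \Omega \subset R^q \to R^N$ (with Ω an open subset of $R^q$)
be a $C^1$ (i.e. continuously differentiable on Ω) parametrization of the
system matrices (A,B,C,D) and suppose $(A,B,C,D)(\hat\alpha)$ is minimal. Then

1) (A,B,C,D)(α) is locally identifiable at $\alpha = \hat\alpha$ if and only if
$F: GL(n) \times \Omega \to R^N$ is locally injective at T = I and $\alpha = \hat\alpha$, where

$$F(T,\alpha) = \left(TA(\alpha)T^{-1},\; TB(\alpha),\; C(\alpha)T^{-1},\; D(\alpha)\right).$$

2) If rank $\dfrac{\partial \bar{F}(I,\alpha)}{\partial (\bar{T},\alpha)} = r$ for all α in some neighborhood of $\hat\alpha$, then
(A,B,C,D)(α) is locally identifiable at $\alpha = \hat\alpha$ if and only if
$r = n^2 + q$, or equivalently $\det[X'(\hat\alpha)X(\hat\alpha)] \neq 0$, where

$$X(\alpha) = \left[\begin{array}{c|c} \begin{matrix} I_n \otimes A'(\alpha) - A(\alpha) \otimes I_n \\ I_n \otimes B'(\alpha) \\ -C(\alpha) \otimes I_n \\ 0 \end{matrix} & M(\alpha) \end{array}\right]$$

M(α) and notation are as defined in Theorem 3.3.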

Proof

1) Necessity

If F is not locally injective then for all $\epsilon > 0$ there exist
$(T_\epsilon, \alpha_\epsilon), (S_\epsilon, \beta_\epsilon) \in N_\epsilon(I, \hat\alpha)$ such that $F(T_\epsilon, \alpha_\epsilon) = F(S_\epsilon, \beta_\epsilon)$ and therefore

$$S_\epsilon^{-1} T_\epsilon A(\alpha_\epsilon) T_\epsilon^{-1} S_\epsilon = A(\beta_\epsilon)$$
$$S_\epsilon^{-1} T_\epsilon B(\alpha_\epsilon) = B(\beta_\epsilon)$$
$$C(\alpha_\epsilon) T_\epsilon^{-1} S_\epsilon = C(\beta_\epsilon)$$
$$D(\alpha_\epsilon) = D(\beta_\epsilon)$$

Therefore there are equivalent systems in an arbitrarily small
neighborhood of $\hat\alpha$, and the parametrization is not locally identifiable.
(Note that the fact that GL(n) is an open subset of $R^{n \times n}$ is used.)

Sufficiency

First note that since $(A,B,C,D)(\hat\alpha)$ is minimal there exists a
neighborhood $W \subset \Omega$ of $\hat\alpha$ such that (A,B,C,D)(α) is minimal for all $\alpha \in W$
(since minimal systems form an open set in parameter space and the
parametrization is assumed to be continuous). Therefore when restricted
to W all equivalent systems are related by a similarity transformation.
Therefore the parametrization is locally identifiable if F is injective
when restricted to $GL(n) \times V$, where $V \subset W$ is any open set containing $\hat\alpha$.
In order to prove the result we will prove the contrapositive.
Assume therefore that the parametrization is not locally identifiable,
then for all $\epsilon > 0$ there exist $T_\epsilon, S_\epsilon \in GL(n)$, $\alpha_\epsilon \neq \beta_\epsilon \in N_\epsilon(\hat\alpha) \subset W$
such that $F(T_\epsilon, \alpha_\epsilon) = F(S_\epsilon, \beta_\epsilon)$. Therefore we have that

$$S_\epsilon^{-1} T_\epsilon = W(\beta_\epsilon) W'(\alpha_\epsilon) \left[W(\alpha_\epsilon) W'(\alpha_\epsilon)\right]^{-1}$$

where $W(\alpha) = [B(\alpha), A(\alpha)B(\alpha), \ldots, A^{n-1}(\alpha)B(\alpha)]$.

$S_\epsilon^{-1} T_\epsilon$ is therefore a continuous function of $(\alpha_\epsilon, \beta_\epsilon)$ since W(α)
has full rank for all $\alpha \in W$ by the reachability assumption. Therefore
$\|S_\epsilon^{-1} T_\epsilon - I\|$ can be made arbitrarily small by taking ε sufficiently
small and $F(S_\epsilon^{-1} T_\epsilon, \alpha_\epsilon) = F(I, \beta_\epsilon)$. Hence there does not exist a
neighborhood of $(I, \hat\alpha)$ in which F is injective, and thus F is not locally injective.

2) To prove this we use part (1) above, and Corollary 3.2. First we will
compute the Jacobian of F or equivalently $\bar{F}$, which is given by,

$$\lim_{h \to 0} \frac{1}{h}\left(F(T + h\delta T, \alpha) - F(T, \alpha)\right)$$

$$= \lim_{h \to 0} \frac{1}{h}\Big((T + h\delta T)A(\alpha)(T + h\delta T)^{-1} - TA(\alpha)T^{-1},\; (T + h\delta T)B(\alpha) - TB(\alpha),\; C(\alpha)(T + h\delta T)^{-1} - C(\alpha)T^{-1},\; 0\Big)$$

$$= \lim_{h \to 0} \frac{1}{h}\Big(h\,\delta T A(\alpha)T^{-1} - h\,TA(\alpha)T^{-1}\delta T\,T^{-1} + O(h^2),\; h\,\delta T B(\alpha),\; -h\,C(\alpha)T^{-1}\delta T\,T^{-1} + O(h^2),\; 0\Big)$$

$$= \Big(\delta T A(\alpha)T^{-1} - TA(\alpha)T^{-1}\delta T\,T^{-1},\; \delta T B(\alpha),\; -C(\alpha)T^{-1}\delta T\,T^{-1},\; 0\Big)$$

Therefore using notation of Appendix I,

$$\frac{\partial \bar{F}(T,\alpha)}{\partial \bar{T}} = \begin{bmatrix} I \otimes T^{-1\prime}A'(\alpha) - TA(\alpha)T^{-1} \otimes T^{-1\prime} \\ I \otimes B'(\alpha) \\ -C(\alpha)T^{-1} \otimes T^{-1\prime} \\ 0 \end{bmatrix}$$

Similarly

$$\frac{\partial \bar{F}(T,\alpha)}{\partial \alpha} = \begin{bmatrix} T \otimes T^{-1\prime} & 0 & 0 & 0 \\ 0 & T \otimes I & 0 & 0 \\ 0 & 0 & I \otimes T^{-1\prime} & 0 \\ 0 & 0 & 0 & I \otimes I \end{bmatrix} M(\alpha)$$

and thus

$$\left[\frac{\partial \bar{F}(T,\alpha)}{\partial \bar{T}},\; \frac{\partial \bar{F}(T,\alpha)}{\partial \alpha}\right] = \begin{bmatrix} T \otimes T^{-1\prime} & 0 & 0 & 0 \\ 0 & T \otimes I & 0 & 0 \\ 0 & 0 & I \otimes T^{-1\prime} & 0 \\ 0 & 0 & 0 & I \otimes I \end{bmatrix} X(\alpha) \begin{bmatrix} T^{-1} \otimes I & 0 \\ 0 & I_q \end{bmatrix}$$

Therefore since $T \in GL(n)$ the rank of the Jacobian of F at
(T,α) is equal to the rank of X(α), hence the assumption that rank
X(α) = r for all $\alpha \in N_\epsilon(\hat\alpha)$ implies that rank $\dfrac{\partial \bar{F}}{\partial (\bar{T},\alpha)} = r$ for all
(T,α) in some neighborhood of $(I, \hat\alpha)$. Therefore the assumptions of
Corollary 3.2 are valid and the result follows immediately. ■

Remarks

1) If rank $X(\hat\alpha) = n^2 + q$ then rank $X(\alpha) = n^2 + q$ in some neighborhood
of $\hat\alpha$, and hence the condition for identifiability is simply
$\det[X'(\hat\alpha)X(\hat\alpha)] \neq 0$. Requiring that the rank of X(α) is constant in a
neighborhood of $\hat\alpha$ is specified so that unidentifiable parametrizations
can be found from the test. Those systems which have rank $X(\alpha) < n^2 + q$ at
$\alpha = \hat\alpha$ but not in a neighborhood of $\hat\alpha$ may or may not be locally
identifiable, however it can be said that the sensitivity of the input/output
response to certain small changes in the parameter values is zero.
This situation is analogous to trying to estimate α from noisy observations
of $\alpha^3$, in a neighborhood of α = 0.

2) Theorem 3.4 gives a comparatively simple test for the local
identifiability of a parametrization, with the unknown parameter
entering in a straightforward manner. It is significantly simpler
than the methods based on the information matrix (see Section 3.6),
and more elegant than the condition of Theorem 3.3. The computational
comparison between the tests of Theorems 3.3 and 3.4 is not clear, in
that although Theorem 3.3 reduces to evaluating a q × q determinant,
while the condition of Theorem 3.4 involves a determinant of dimension
$n^2 + q$ (which can be reduced to $n^2$), the precomputing required in Theorem 3.3
is considerable. The test of Theorem 3.4 allows some unknown parameters
to be left as free parameters and the determinant evaluated as a function
of them, so that regions of local identifiability can be deduced.
However in Theorem 3.3 such calculations could be exceedingly tedious.

3) If a parametrization is locally identifiable, this ensures that
any well-conceived algorithm which minimizes some cost function over
the parameters will be well-posed and have a unique solution in some
neighborhood of the nominal values. Further if a parametrization is
locally identifiable for all values of $\alpha \in \Omega$ then an algorithm will
always be well-posed but may converge to one of several solutions
depending on the initial parameter estimates and the actual data
received. This is the problem of global identifiability which will be
discussed in the next section. First we will give some examples to
illustrate the local identifiability theorems.

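Example 3.1 (continued)

(i) For the parametrization of Example 3.1 (i),

$$X(\alpha) = \left[\begin{array}{cccc|cc} 0 & 0 & -\alpha_1 & 0 & 0 & 0 \\ \alpha_1 & -1 & 0 & -\alpha_1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & \alpha_1 & 0 & 0 & 0 \\ 0 & \alpha_2 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \alpha_2 & 0 & 1 \\ -1 & 0 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{array}\right]$$

The last 3 columns are clearly dependent for all $\alpha \in R^2$
and hence the parametrization is not locally identifiable for any
$\alpha_1$ and $\alpha_2$.

(ii) For the parametrization of Example 3.1 (ii)

$$X(\alpha) = \left[\begin{array}{cccc|cc} 0 & 0 & -1 & 0 & 1 & 0 \\ 1 & \alpha_2 - \alpha_1 & 0 & -1 & 0 & 0 \\ 0 & 0 & \alpha_1 - \alpha_2 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{array}\right]$$

Here $\det[X'(\alpha)X(\alpha)]$ contains the factor $(\alpha_1 - \alpha_2)^2$ and is non-zero whenever
$\alpha_1 \neq \alpha_2$, and so the parametrization is locally identifiable
if $\hat\alpha_1 \neq \hat\alpha_2$; however the region of local identifiability is small when
$\hat\alpha_1 \approx \hat\alpha_2$. The fact that it is not locally identifiable at $\hat\alpha_1 = \hat\alpha_2$ is
true but cannot be deduced from the theorems.

For comparison using Theorem 3.3 for this parametrization gives

$$\frac{\partial G(\alpha)}{\partial \alpha} = \left[\begin{array}{cccc|cc|cc|c} 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & \alpha_1 & 1 & 1 & \alpha_2 & 0 \\ 1 & \alpha_1 + \alpha_2 & 0 & 1 & \alpha_1^2 & \alpha_1 + \alpha_2 & \alpha_1 + \alpha_2 & \alpha_2^2 & 0 \\ 2\alpha_1 + \alpha_2 & \alpha_1^2 + \alpha_1\alpha_2 + \alpha_2^2 & 1 & \alpha_1 + 2\alpha_2 & \alpha_1^3 & \alpha_1^2 + \alpha_1\alpha_2 + \alpha_2^2 & \alpha_1^2 + \alpha_1\alpha_2 + \alpha_2^2 & \alpha_2^3 & 0 \end{array}\right] \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 1 & 1 \\ 2\alpha_1 + \alpha_2 & \alpha_1 + 2\alpha_2 \end{bmatrix}$$

and

$$\det\left[\left(\frac{\partial G}{\partial \alpha}\right)' \left(\frac{\partial G}{\partial \alpha}\right)\right] = (\alpha_1 - \alpha_2)^2.$$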
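
The rank test of Theorem 3.4 can be sketched for the two parametrizations of Example 3.1. The construction below assumes $X(\alpha) = [\,I_n \otimes A' - A \otimes I_n;\; I_n \otimes B';\; -C \otimes I_n;\; 0 \mid M(\alpha)\,]$ with matrices vectorized row by row, as follows from the Jacobian computed in the proof of the theorem; the helper names and the parameter values are hypothetical.

```python
from fractions import Fraction

def rank(M):
    """Exact rank by Gauss-Jordan elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def kron(X, Y):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in rx for b in ry] for rx in X for ry in Y]

def eye(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def transpose(X):
    return [list(col) for col in zip(*X)]

def build_X(A, B, C, M):
    """Assemble X(alpha) from (A, B, C) and the parameter Jacobian M."""
    n, m, p = len(A), len(B[0]), len(C)
    I = eye(n)
    top = [[u - v for u, v in zip(r1, r2)]
           for r1, r2 in zip(kron(I, transpose(A)), kron(A, I))]
    mid = kron(I, transpose(B))
    low = [[-v for v in row] for row in kron(C, I)]
    zero = [[0] * (n * n) for _ in range(p * m)]      # rows belonging to D
    return [l + r for l, r in zip(top + mid + low + zero, M)]

def X_first(a1, a2):
    A, B, C = [[-1, a1], [0, -2]], [[0], [a2]], [[1, 0]]
    M = [[0, 0] for _ in range(9)]
    M[1][0] = 1      # alpha_1 enters A in position (1,2)
    M[5][1] = 1      # alpha_2 enters B in position (2,1)
    return build_X(A, B, C, M)

def X_second(a1, a2):
    A, B, C = [[a1, 1], [0, a2]], [[0], [1]], [[1, 0]]
    M = [[0, 0] for _ in range(9)]
    M[0][0] = 1      # alpha_1 enters A in position (1,1)
    M[3][1] = 1      # alpha_2 enters A in position (2,2)
    return build_X(A, B, C, M)

rank_first = rank(X_first(2, 3))        # parametrization (i)
rank_second = rank(X_second(2, 5))      # parametrization (ii), distinct values
rank_repeated = rank(X_second(2, 2))    # parametrization (ii), equal values
print(rank_first, rank_second, rank_repeated)
```

Full column rank corresponds to $n^2 + q = 6$ here, so under these assumptions only parametrization (ii) with distinct parameter values passes the test.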


Example 3.3

A simple-minded extension of the standard controllable form
for single input systems is given in the following proposition, and
it is shown that except when all the indices are equal the
parametrization is not locally identifiable anywhere.

Proposition 3.5

Consider the set $(k_1, k_2, \ldots, k_m)$ summing to n, then the
parametrization of the system matrices (A,B,C) given below is never locally
identifiable anywhere unless the $k_i$ are all equal, in which case it is
locally identifiable for all $\alpha \in R^{n(m+p)}$.

C is completely free.

A and B are block matrices

$$A = (A_{ij})_{i,j=1,\ldots,m}, \qquad B = (B_{ij})_{i,j=1,\ldots,m},$$

where x's are free parameters.

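Proof

The local identifiability condition of Theorem 3.4 is equivalent
to,

$$(*) \qquad \left.\begin{aligned} Q A(\alpha) - A(\alpha) Q &= \Delta A \\ Q B(\alpha) &= \Delta B \\ -C(\alpha) Q &= \Delta C \end{aligned}\right\} \;\Longrightarrow\; Q = 0$$

where (ΔA, ΔB, ΔC) represent admissible local variations in (A,B,C)
(i.e. if the implication holds at some $\hat\alpha$ then local identifiability
results, and if the implication does not hold for all α in some
neighborhood of $\hat\alpha$ then it is not locally identifiable).

Now because of the structure of (ΔA, ΔB, ΔC), (*) is equivalent to

$$(**) \qquad \left.\begin{aligned} E\left(Q A(\alpha) - A(\alpha) Q\right) &= 0 \\ Q B &= 0 \end{aligned}\right\} \;\Longrightarrow\; Q = 0$$

where

$$E = \begin{bmatrix} [\, I_{k_1-1} \;\; 0_{k_1-1,1} \,] & & & \\ & [\, I_{k_2-1} \;\; 0_{k_2-1,1} \,] & & \\ & & \ddots & \\ & & & [\, I_{k_m-1} \;\; 0_{k_m-1,1} \,] \end{bmatrix}.$$

Let

$$Q = \begin{bmatrix} Q_{11} & Q_{12} & \cdots & Q_{1m} \\ Q_{21} & & & \vdots \\ \vdots & & & \\ Q_{m1} & \cdots & & Q_{mm} \end{bmatrix}$$

with $Q_{ij}$ a $(k_i \times k_j)$ matrix.
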

Then Q B = 0 

M 


Q. . B. . =0, 
3-D 3-3 


t ^ — 0 for & — l,2 r ..,k^. 

3 ' j 

The i-j th block of (**) gives 


or the £,k. th element of 
3 


, m m . 

[V V 1 ' 1 ] IJx 8 ik V°’ " ii Aik<ale ^ I 


= 0 


And since . = 0 this becomes 

'W ^0 


Ti , o ] ig..A..(a) - A..(a)Q.. } = 0 

[ \-i' k.-i,ij r iD j3 11 13 1 


that is for i,j =
0 q il q i2 q i,k.-l

ij 

q 21 q 22 ’ * ‘ q 2,k -1 ° 

q 3i : ? 

3 

0 q 2i : 


41 9 • 

• * 

! 

0 q k . -1 , 1 q k.-l,k.-l 


» » 

• 

q k.,i q k.,k.-i 0 

i ' 13 



Therefore there are $(k_i - 1)k_j$ equations in $(k_j - 1)k_i$ unknowns, and
if $k_i < k_j$ there are more unknowns than linear equations and a non-zero
solution exists. If $k_i \geq k_j$ there is a unique solution for $Q_{ij}$ (i.e.

zero) and this can be verified recursively, (q^ = 0 q^ + ^ 2 = ° ^ 

• * .^> = 2,3,.., k and q n ^ = 0 — q^_ ^ ^ _2 ~ 0 ... — 1,2,.. ,k^ — 1) . 

' j" ' 3 

Therefore the parametrization is not locally identifiable for 

any a if k. ^ k. for some i and j, and it is locally identifiable for 
l 3 

all a if k. = k. for all i and j. (In fact it is globally identifiable 
^ 3 

in this case by Theorem 3-6 , see next section)- ES 
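The counting argument of the proof can be checked numerically. The following is a minimal sketch, assuming that the constraints on one block $Q_{ij}$ are exactly those used in the proof: a zero last column (from $QB = 0$) and the shift relation $q_{\ell,c-1} = q_{\ell+1,c}$ (from the block equation). The function name `nullity` is hypothetical. It returns the dimension of the solution space, which should be $\max(k_j - k_i, 0)$.

```python
import numpy as np

def nullity(ki, kj):
    """Dimension of the solution space for one (ki x kj) block Q_ij."""
    n_unknowns = ki * kj
    idx = lambda l, c: l * kj + c          # row-major index of q[l, c]
    rows = []
    # Last column of Q_ij is zero (from Q B = 0).
    for l in range(ki):
        r = np.zeros(n_unknowns)
        r[idx(l, kj - 1)] = 1.0
        rows.append(r)
    # Shift relation q[l, c-1] - q[l+1, c] = 0, with q[l, -1] taken as 0.
    for l in range(ki - 1):
        for c in range(kj):
            r = np.zeros(n_unknowns)
            if c >= 1:
                r[idx(l, c - 1)] += 1.0
            r[idx(l + 1, c)] -= 1.0
            rows.append(r)
    return n_unknowns - int(np.linalg.matrix_rank(np.array(rows)))

for ki, kj in [(2, 2), (2, 3), (3, 2), (1, 4)]:
    print(ki, kj, nullity(ki, kj))
```

A non-zero nullity for some pair $(i,j)$ corresponds to a non-zero $Q$ and hence to loss of local identifiability; equal indices give nullity zero for every block.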

Some authors imply that the above parametrization is useful (e.g. Jordan and Shridar (1973)), but the above proposition shows that it is rarely identifiable, and hence there will not be a unique representation of a particular system response. The correct extension of the standard controllable form is given by Theorem 2.5.

Example 3.4 

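We now give an example of a parametrization which is locally identifiable for all $\alpha \in R$ but is not globally identifiable.

$$A(\alpha) = \begin{bmatrix} -1 & 0 \\ 1+2\alpha & -5 \end{bmatrix}, \qquad B(\alpha) = \begin{bmatrix} 2+\alpha \\ 1 \end{bmatrix}, \qquad C(\alpha) = [\,1-\alpha,\ \ \alpha\,]$$

which has transfer function,

$$\frac{s(2-\alpha^2) + (2\alpha^3 - 2\alpha + 10)}{(s+1)(s+5)}$$

and Markov parameters,

$$G(\alpha) = \begin{bmatrix} 2 - \alpha^2 \\ -2 - 2\alpha + 6\alpha^2 + 2\alpha^3 \\ 2 + 12\alpha - 31\alpha^2 - 12\alpha^3 \\ -2 - 62\alpha + 156\alpha^2 + 62\alpha^3 \end{bmatrix}$$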

Therefore as in Theorem 3.3

$$\frac{\partial G(\alpha)}{\partial \alpha} = \begin{bmatrix} -2\alpha \\ -2 + 12\alpha + 6\alpha^2 \\ 12 - 62\alpha - 36\alpha^2 \\ -62 + 312\alpha + 186\alpha^2 \end{bmatrix}$$

which is clearly of full rank for all $\alpha \in R$, and hence by Theorem 3.3 the parametrization is locally identifiable for all $\alpha \in R$. However the systems with $\alpha = 1$ and $\alpha = -1$ have the same transfer function

$$\frac{s + 10}{(s+1)(s+5)}$$

and therefore the parametrization is not globally identifiable.
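The coincidence at $\alpha = \pm 1$ can be confirmed by computing the Markov parameters directly. This is a minimal sketch, assuming the matrices of the example are $A(\alpha) = [[-1, 0],[1+2\alpha, -5]]$, $B(\alpha) = [2+\alpha,\ 1]'$ and $C(\alpha) = [1-\alpha,\ \alpha]$; the helper names `system` and `markov` are hypothetical.

```python
import numpy as np

def system(alpha):
    """System matrices of Example 3.4 (as read from the text)."""
    A = np.array([[-1.0, 0.0], [1.0 + 2.0 * alpha, -5.0]])
    B = np.array([[2.0 + alpha], [1.0]])
    C = np.array([[1.0 - alpha, alpha]])
    return A, B, C

def markov(alpha, count=4):
    """First `count` Markov parameters C A^k B, k = 0, 1, ..."""
    A, B, C = system(alpha)
    out, Ak = [], np.eye(2)
    for _ in range(count):
        out.append((C @ Ak @ B).item())
        Ak = A @ Ak
    return np.array(out)

def markov_poly(alpha):
    """Closed-form Markov parameters G(alpha) stated in the example."""
    a = alpha
    return np.array([
        2 - a**2,
        -2 - 2*a + 6*a**2 + 2*a**3,
        2 + 12*a - 31*a**2 - 12*a**3,
        -2 - 62*a + 156*a**2 + 62*a**3,
    ])

print(markov(1.0), markov(-1.0))
```

Both parameter values give the same Markov sequence, so no input/output experiment can distinguish them, even though the derivative of $G$ never vanishes.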


The variation of the transfer function with $\alpha$ is shown below in Figure 3.1.

3.3 Global Identifiability

As remarked in section 3.2, from a practical point of view local identifiability has the disadvantages that a nominal value for $\alpha$ is required, and that given local identifiability the extent of the region of identifiability is not easily found. Hence the concept of global identifiability is now defined.

Definition 3.2 

Let $(A,B,C,D)(\alpha): \Omega \subset R^q \to R^N$ be a parametrization of the system matrices (A,B,C,D). This parametrization is said to be globally identifiable (from the transfer function) if, for all $\alpha, \beta \in \Omega$,

(i) $D(\alpha) = D(\beta)$

(ii) $C(\alpha)A^k(\alpha)B(\alpha) = C(\beta)A^k(\beta)B(\beta)$, k = 0,1,2,...
and (iii) $(A,B,C,D)(\alpha)$ is minimal

imply $\alpha = \beta$.

Condition (iii) in the above definition could be deleted, but 
then the definition would be very restrictive since most useful 
parametrizations admit multiple representations of non-minimal systems. 

The following proposition gives a sufficient condition for global identifiability when the parametrization is affine (i.e. $f: R^m \to R^n$ is affine if $f(x) = c + \ell(x)$ where $\ell$ is linear and c is a constant). Affine parametrizations occur frequently in practice, for example all the standard canonical forms are affine.

Theorem 3 . 6 

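An affine parametrization $(A,B,C,D)(\alpha): \Omega \subset R^q \to R^N$ is globally identifiable if for all $\alpha, \beta \in \Omega$,

$$R(Z(\alpha,\beta)) \cap R(Z(\beta,\alpha)) \cap R(M) = \{0\}$$

and this is implied by $\det[Y'(\alpha,\beta)Y(\alpha,\beta)] \ne 0$ for all $\alpha, \beta \in \Omega$. In here

$$Z(\alpha,\beta) = \begin{bmatrix} I \otimes A'(\alpha) - A(\beta) \otimes I \\ I \otimes B'(\alpha) \\ -C(\beta) \otimes I \\ 0 \end{bmatrix}$$

M is as defined in Theorem 3.3, and

$$Y(\alpha,\beta) = \begin{bmatrix} Z(\alpha,\beta) & 0 & M \\ 0 & Z(\beta,\alpha) & M \end{bmatrix}$$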


Proof 

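Since we are only concerned with minimal systems, global identifiability is implied if the following equations have a unique solution for all $\alpha, \beta \in \Omega$,

$$(*)\qquad \left.\begin{aligned} T A(\alpha) &= A(\beta) T \\ T B(\alpha) &= B(\beta) \\ C(\alpha) &= C(\beta) T \\ D(\alpha) &= D(\beta) \end{aligned}\right\}$$

Let $Q_1 = T - I$ and $Q_2 = I - T^{-1}$; then (*) is equivalent to,

$$Z(\alpha,\beta)\, Q_1 = M(\beta - \alpha)$$
or
$$Z(\beta,\alpha)\, Q_2 = M(\beta - \alpha)$$

which will have a unique solution if $R(Z(\alpha,\beta)) \cap R(Z(\beta,\alpha)) \cap R(M) = \{0\}$. Finally the condition on $Y(\alpha,\beta)$ is immediate.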
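The Kronecker structure of $Z(\alpha,\beta)$ can be checked against the matrix equations it encodes. The sketch below assumes that matrices are listed as vectors by rows (the convention the text uses later for the Markov parameters), so that NumPy's row-major `ravel()` matches; the function name `build_Z` and its argument names are hypothetical.

```python
import numpy as np

def build_Z(A_alpha, B_alpha, A_beta, C_beta):
    """Stack the blocks of Z(alpha, beta) acting on Q listed by rows."""
    n = A_alpha.shape[0]
    m = B_alpha.shape[1]
    p = C_beta.shape[0]
    I = np.eye(n)
    return np.vstack([
        np.kron(I, A_alpha.T) - np.kron(A_beta, I),  # Q A(alpha) - A(beta) Q
        np.kron(I, B_alpha.T),                       # Q B(alpha)
        -np.kron(C_beta, I),                         # -C(beta) Q
        np.zeros((p * m, n * n)),                    # D does not depend on Q
    ])

rng = np.random.default_rng(1)
n, m, p = 3, 2, 2
Aa, Ab = rng.standard_normal((n, n)), rng.standard_normal((n, n))
Ba, Cb = rng.standard_normal((n, m)), rng.standard_normal((p, n))
Q = rng.standard_normal((n, n))

Z = build_Z(Aa, Ba, Ab, Cb)
direct = np.concatenate([
    (Q @ Aa - Ab @ Q).ravel(),
    (Q @ Ba).ravel(),
    (-Cb @ Q).ravel(),
    np.zeros(p * m),
])
print(np.allclose(Z @ Q.ravel(), direct))
```

With Z available as a matrix, the range conditions of the theorem reduce to rank computations on `Z` and `M` for given parameter values.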


Remarks:

1) The condition is not necessary since $(Q_1, Q_2, \tilde\alpha - \tilde\beta) \in N(Y(\alpha,\beta))$ does not imply that $(I + Q_1)^{-1} = I - Q_2$ or that $\tilde\alpha - \tilde\beta = \alpha - \beta$, which are required for a system not to be globally identifiable.

2) A somewhat more restrictive sufficient condition for global identifiability is that $R(Z(\alpha,\beta)) \cap R(M) = \{0\}$ for all $\alpha, \beta \in \Omega$. We remark that this condition is in fact satisfied by the canonical forms given in Theorems 2.4 and 2.5 and the former case is proven in the following proposition.

Proposition 3 . 7 

Let the set of integers $(s_0, s_1, \ldots, s_m)$ satisfy

$$0 = s_0 < s_1 < s_2 < \cdots < s_{m-1} < s_m = n.$$

Then the following parametrization satisfies $\det[(Z(\alpha,\beta),M)'(Z(\alpha,\beta),M)] \ne 0$ for all $\alpha, \beta \in R^{n(m+p)}$.

$$A = [\,a_1\ a_2\ \cdots\ a_n\,], \qquad B = [\,b_1\ b_2\ \cdots\ b_m\,]$$

$$a_i = \begin{cases} \text{free} & \text{if } i = s_j,\ j = 1,\ldots,m \\ e_{i+1} & \text{otherwise} \end{cases}$$

$$b_i = e_{s_{i-1}+1}$$

C = free

where $e_i$ is the $i$-th unit vector.


Proof 

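$\det[(Z(\alpha,\beta),M)'(Z(\alpha,\beta),M)] \ne 0\ \ \forall \alpha, \beta$ is implied if

$$\left.\begin{aligned} Q A(\alpha) - A(\beta) Q &= \Delta A \\ Q B(\alpha) &= \Delta B \\ -C(\beta) Q &= \Delta C \end{aligned}\right\} \;\Rightarrow\; Q = 0$$

where $(\Delta A, \Delta B, \Delta C)$ are admissible variations in (A,B,C).

Set

$$Q = [\,q_1\ q_2\ \cdots\ q_n\,], \qquad QA(\alpha) = [\,p_1\ p_2\ \cdots\ p_n\,], \qquad A(\beta)Q = [\,r_1\ r_2\ \cdots\ r_n\,].$$

Then

$$p_i = \begin{cases} Q\,a_i(\alpha) & \text{if } i = s_j \\ q_{i+1} & \text{otherwise} \end{cases} \qquad\qquad r_i = A(\beta)\, q_i$$

$$(*)\qquad QB = [\,q_1\ \ q_{s_1+1}\ \cdots\ q_{s_{m-1}+1}\,] = \Delta B = 0$$

$QA(\alpha) - A(\beta)Q = \Delta A$ implies

$$p_i = r_i \quad \text{for } i \ne s_j.$$

Set $i = s_j + 1$; then

$$q_{s_j+2} = A(\beta)\, q_{s_j+1}$$

but $q_{s_j+1} = 0$ by (*) and hence $q_{s_j+2} = 0$, and hence recursively until $q_{s_{j+1}} = 0$. Hence $Q = 0$ and the result is proven.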

Example 3.5

The following is an example of a globally identifiable parametrization which fails the condition $R(Z(\alpha,\beta)) \cap R(M) = \{0\}$ for all $\alpha, \beta$.




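$$A(\alpha) = \begin{bmatrix} \alpha_1 & 1+2\alpha_2 \\ 0 & \alpha_1+\alpha_2 \end{bmatrix}, \qquad B(\alpha) = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad C(\alpha) = [\,1\ \ 0\,]$$

Now, the transfer function

$$= \frac{1 + 2\alpha_2}{s^2 - (2\alpha_1+\alpha_2)s + \alpha_1(\alpha_1+\alpha_2)}$$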
and the parametrization is globally identifiable, since for $(\alpha_1,\alpha_2)$ such that the system is minimal, the three coefficients of the transfer function can be identified, and $\alpha_2$ can be determined from the numerator and $\alpha_1$ can then be determined from the coefficients of s.

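Now in this parametrization $\det[(Z(\alpha,\beta),M)'(Z(\alpha,\beta),M)] = 0$ if and only if

$$\begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix} + \begin{bmatrix} 1 \\ -1 \end{bmatrix}$$

in which case

$$A(\beta) = \begin{bmatrix} 1+\alpha_1+\alpha_2 & -(1+2\alpha_2) \\ 0 & \alpha_1 \end{bmatrix}$$

and setting

$$Q = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}$$

gives

$$\begin{aligned} Q A(\alpha) - A(\beta) Q &= (1 + 2\alpha_2)\, I_2 \\ Q B(\alpha) &= 0 \\ -C(\beta) Q &= 0 \end{aligned}$$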


which are all admissible variations in A, B and C. Thus we have constructed a non-zero element in $R(Z(\alpha,\beta)) \cap R(M)$.

If the better condition of Theorem 3.6 is used instead of the more restrictive one above, no non-zero solution exists, verifying that the parametrization is globally identifiable. It is also noted that the parametrization satisfies the sufficient condition for local identifiability given in Theorem 3.4 for all parameter values.
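The construction in this example can be reproduced numerically. The sketch below assumes the example's matrices are $A(\alpha) = [[\alpha_1, 1+2\alpha_2],[0, \alpha_1+\alpha_2]]$, $B = [0, 1]'$, $C = [1, 0]$, and that the critical $\beta$ is $\beta_1 = 1+\alpha_1+\alpha_2$, $\beta_2 = -1-\alpha_2$; the function names are hypothetical.

```python
import numpy as np

def A_of(a1, a2):
    """A(alpha) of Example 3.5 (as read from the text)."""
    return np.array([[a1, 1.0 + 2.0 * a2], [0.0, a1 + a2]])

B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Q = np.array([[0.0, 0.0], [1.0, 0.0]])

def variations(a1, a2):
    """Return (Q A(alpha) - A(beta) Q, Q B, -C Q) at the critical beta."""
    b1, b2 = 1.0 + a1 + a2, -1.0 - a2
    dA = Q @ A_of(a1, a2) - A_of(b1, b2) @ Q
    return dA, Q @ B, -C @ Q

dA, dB, dC = variations(0.3, 0.7)
print(dA)   # a multiple of the identity: equal shifts of both diagonal entries
```

A multiple of the identity is an admissible variation of $A$ here because it corresponds to changing $\alpha_1$ alone, which shifts both diagonal entries equally and leaves the off-diagonal entry unchanged.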
Remarks on Global Identifiability

Finding conditions for global identifiability has been the 
major time consumer of this research and is perhaps the least productive 
in its results. It is the purpose of this section to outline the 
mathematical problems in determining global identifiability. 

Definition 3.2 of global identifiability removes non-minimal 
systems from consideration. (see remark following the definition) . 
Therefore define U to be the largest subset of $\Omega$ such that $(A(\alpha), B(\alpha), C(\alpha))$ is minimal for all $\alpha \in U$. Then global identifiability is exactly equivalent to any of the following three conditions.

1) (Markov parameters) G restricted to U is injective (see Theorem 3.3 
for definition of G) . 

2) (Transfer function) H restricted to U is injective, where $H(\alpha)$ is the set of coefficients of the transfer function, i.e. the coefficients of $\det(Is - A(\alpha))$ and the coefficients of

$$\det(Is - A(\alpha)) \times [\,C(\alpha)(Is - A(\alpha))^{-1}B(\alpha) + D(\alpha)\,].$$

3) (Similarity transformation) F restricted to GL(n) x U is injective 
(see Theorem 3.4 for definition of F) . 

The restriction of these functions to U makes any analysis 
intractable in all but the simplest cases, since U is not easily 
described and most useful results in global analysis require that the 
function's domain is well-behaved. 

Assuming that the above conditions are indeed intractable in 
practice, sufficient conditions can be obtained if the functions F,G, or 
H are injective without the restriction to U. However such modified 
conditions may be too strong. G will never be injective if multiple 
representations of non-minimal systems are possible. H will not be
injective in many cases; for example in any parametrization with the 
C matrix completely free and more than n free parameters in the A and 
B matrices combined, H cannot be injective since when C is zero the 
only non-zero coefficients in the transfer function are the n coefficients 
of det(Is - A (a)) and hence since we assumed greater than n free 
parameters in A and B, there will be infinitely many values of a with 
the same image under H. Requiring H to be injective would thus seem 
to be overly restrictive. The function F will not be injective for a 
globally identifiable parametrization, only if two non-minimal systems 
are related by a similarity transformation. This is in fact much less 
restrictive than with the two other functions G and H but there exist 
examples where F is not injective but the parametrization is globally 
identifiable, as in the following example. 

Example 3.6 



$(\alpha_1, \alpha_2) \in R^2$


$F(T,\alpha) = (TA(\alpha)T^{-1},\ TB(\alpha),\ C(\alpha)T^{-1},\ D(\alpha))$



for all $t \ne 0$ so that F is not injective at $\alpha_2 = 0$. Notice that $\alpha_2 = 0$ corresponds to the unobservable systems so does not affect global identifiability, which is ensured if we look at the transfer function $= \alpha_2(1 + \alpha_1 s)/s^2$, and assume $\alpha_2 \ne 0$. For this system the functions G and H are also not injective at $\alpha_2 = 0$.

We can thus conclude that a reasonable sufficient condition for global identifiability is that $F: GL(n) \times \Omega \to R^N$ is injective. Such a question is in general exceedingly difficult to answer without some additional assumptions. The general mathematical problem of determining whether a map from a subset of one Euclidean space into another is injective is non-trivial. Clearly a necessary condition is that it is locally injective everywhere, however this is not sufficient. If the image space is of higher dimension than the domain, functions which are locally injective everywhere but not globally injective are very easy to construct (see for example Example 3.4), and general results are very restrictive.

If the domain and image spaces are of the same dimension then better results are available. For example Palais' Theorem (see Palais (1959), Wu and Desoer (1972), Ortega and Rheinboldt (1970)) states that: If f is a $C^k$ map ($k \ge 1$) from $R^n$ into $R^n$, then f is a $C^k$ diffeomorphism if and only if (i) $\det\left[\frac{\partial f}{\partial x}(x)\right] \ne 0\ \ \forall x \in R^n$ and (ii) $\lim_{\|x\| \to \infty} \|f(x)\| = \infty$. (Note, a $C^k$ diffeomorphism is by definition a bijective $C^k$ map whose inverse is also $C^k$.) This is a very strong and also surprising result in that the conditions are both necessary and sufficient. However we are only interested in maps being injective and do not require them to be surjective. Palais' Theorem could be applied to single input/single output systems with 2n degrees of freedom, to show that H is bijective.

General conditions for a map just to be injective tend to be very restrictive and hard to verify. For example in the related control area of the global observability of nonlinear dynamical systems it is required to have the map from the initial condition into the output sequence injective, and this has been considered in some detail by Fitts (1970), who did not find any sufficient conditions for maps to be injective that were not too restrictive for our present purposes.

The global identifiability problem is complicated considerably by the domain of F being $GL(n) \times \Omega$. However the condition that

$$\det\left[\frac{\partial F}{\partial (T,\alpha)}(T,\alpha)\right] \ne 0 \quad \text{for all } T \text{ and } \alpha$$

can be replaced by

$$\det\left[\frac{\partial F}{\partial (T,\alpha)}(I,\alpha)\right] \ne 0 \quad \text{for all } \alpha$$

(see proof of Theorem 3.4). Also if we assume that $(A,B,C,D)(\alpha)$ is affine then F is a highly structured function. We now make a conjecture for which no proof or counter examples are known to us.

(Open) Conjecture

Let $(A,B,C,D)(\alpha): R^{nm+np+mp} \to R^N$ be an affine parametrization, then it is globally identifiable if

$$\det\left[\frac{\partial F}{\partial (T,\alpha)}(I,\alpha)\right] \ne 0 \quad \text{for all } \alpha \in R^{nm+np+mp}.$$
If the dimension of $\alpha$ is allowed to be less than (nm+np+mp) then counter examples to the conjecture can be found (see Example 3.4). Further, if the affine restriction is not made, counter examples can be found (see below).

Example 3 . 7 

Consider the parametrization, 

2 

A (a) = B(a) = (a 2 , 1 - cx 2 ) , C(a) - a , D(a) =(0,0) then 

To 1 o o 1 



|j-a 3 o o l 

= l + a 2 2 > l v a e R 3 

However if f 0 then 



So that F is not injective (even if restricted to minimal systems) and so the parametrization is not globally identifiable.

In conclusion it would seem from the above discussion that the sufficient condition for global identifiability given in Theorem 3.6 is a good condition, but that if the number of degrees of freedom is (nm+np+mp) then the better condition of the Open Conjecture may be true.
3.4 Partial Identifiability


An interesting question that arises in some practical applications is: given a parametrization $(A,B,C,D)(\alpha,\beta): \Omega_1 \times \Omega_2 \to R^N$, can $\alpha$ be identified independently from $\beta$? The implication here is that we are only interested in $\alpha$ and are not concerned if $\beta$ is not identified uniquely. For example $\beta$ could represent the feedback gains and $\alpha$ some open loop parameters. This motivates the following definition.

Definition 3.3 

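A parametrization $(A,B,C,D)(\alpha,\beta): \Omega_1 \times \Omega_2 \subset R^{q_1} \times R^{q_2} \to R^N$ is said to be locally partially identifiable in $\alpha$ at $\alpha = \hat\alpha$ and $\beta = \hat\beta$ if there exists an $\epsilon > 0$ such that

(i) $\|\alpha_i - \hat\alpha\| < \epsilon$, $\|\beta_i - \hat\beta\| < \epsilon$, i = 1, 2,

(ii) $D(\alpha_1,\beta_1) = D(\alpha_2,\beta_2)$

and (iii) $C(\alpha_1,\beta_1)A^k(\alpha_1,\beta_1)B(\alpha_1,\beta_1) = C(\alpha_2,\beta_2)A^k(\alpha_2,\beta_2)B(\alpha_2,\beta_2)$, k = 0,1,2,...

imply $\alpha_1 = \alpha_2$.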

The following Theorem gives conditions for local partial identifiability.

Theorem 3.8 

Let $(A,B,C,D)(\alpha,\beta): \Omega_1 \times \Omega_2 \to R^N$ (with $\Omega_i$ open subsets of $R^{q_i}$, i = 1,2) be a $C^1$ parametrization of the system matrices (A,B,C,D), and assume that $(A,B,C,D)(\hat\alpha,\hat\beta)$ is minimal. Suppose

1) rank $[\,Z((\alpha,\beta),(\alpha,\beta)),\ M_\beta(\alpha,\beta)\,] = r_2$ for all $(\alpha,\beta)$ in some neighborhood of $(\hat\alpha,\hat\beta)$.

2) rank $[\,Z((\alpha,\beta),(\alpha,\beta)),\ M_\beta(\alpha,\beta),\ M_\alpha(\alpha,\beta)\,] = r_1 + r_2$ for all $(\alpha,\beta)$ in some neighborhood of $(\hat\alpha,\hat\beta)$.

Then $(A,B,C,D)(\alpha,\beta)$ is locally partially identifiable in $\alpha$ at $(\hat\alpha,\hat\beta)$ if and only if $r_1 = q_1$.

In here $Z((\alpha,\beta),(\alpha,\beta))$ is as defined in Theorem 3.6 and $M_\alpha(\alpha,\beta)$ and $M_\beta(\alpha,\beta)$ are derivatives of the parametrization with respect to $\alpha$ and $\beta$ respectively (see Theorem 3.3).

The proof relies on the following lemma. 
Lemma 3.9 

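Let $\Omega_1$ and $\Omega_2$ be open sets in $R^{m_1}$ and $R^{m_2}$ and $f: \Omega_1 \times \Omega_2 \to R^n$ be a $C^k$ map with $k \ge 1$; thus f maps (x,y) into f(x,y), with $x \in \Omega_1$ and $y \in \Omega_2$. Also assume

1) rank $\left[\dfrac{\partial f(x,y)}{\partial y}\right] = r_2\ \ \forall (x,y)$ in some neighborhood of $(\hat x,\hat y)$.

2) rank $\left[\dfrac{\partial f(x,y)}{\partial (x,y)}\right] = r_1 + r_2\ \ \forall (x,y)$ in some neighborhood of $(\hat x,\hat y)$.

Then there exists a neighborhood of $(\hat x,\hat y)$, say W, such that $f(x_1,y_1) = f(x_2,y_2)$ and $(x_1,y_1), (x_2,y_2) \in W$ imply $x_1 = x_2$, if and only if $r_1 = m_1$.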

Proof of Lemma 3.9

We will now use the rank theorem (Lemma 3.1) to find $f^{-1}(f(x,y))$ for any (x,y) in some neighborhood of $(\hat x,\hat y)$.

From Lemma 3.1 there exist neighborhoods U of $\hat x$ and V of $\hat y$ such that

$$f(x,y) = u \circ \phi \circ u'(x,y)$$

for all $(x,y) \in U \times V$, where u and u' are $C^k$ diffeomorphisms and

$$\phi(x_1,\ldots,x_{m_1+m_2}) = (x_1,\ldots,x_{r_1+r_2},0,\ldots,0)$$

and further

$$f(x,y) = v_x \circ \psi \circ v'_x(y) \quad \text{for all } y \in V \text{ and any fixed } x \in U,$$

where $v_x$ and $v'_x$ are $C^k$ diffeomorphisms and

$$\psi(x_1,\ldots,x_{m_2}) = (x_1,x_2,\ldots,x_{r_2},0,\ldots,0).$$

Therefore

$$S_1(x,y) = f^{-1}(f(x,y)) \cap (U \times V) = (U \times V) \cap u'^{-1}\big(u^{-1}_1(f(x,y)),\ldots,u^{-1}_{r_1+r_2}(f(x,y)),z_1,\ldots,z_k\big)$$

where $z_i \in R$ for $i = 1,2,\ldots,k$, and $k = m_1 + m_2 - r_1 - r_2$.

Also

$$f^{-1}(f(x,y)) \supset S_2(x,y) = \{(x,\tilde y)\ |\ f(x,\tilde y) = f(x,y),\ \tilde y \in V\} = (U \times V) \cap v'^{-1}_x\big(v^{-1}_{x,1}(f(x,y)),\ldots,v^{-1}_{x,r_2}(f(x,y)),z_1,\ldots,z_\ell\big)$$

where $z_i \in R$, $i = 1,2,\ldots,\ell$, and $\ell = m_2 - r_2$.

Hence for all $(x,y) \in U \times V$, $S_1(x,y)$ is homeomorphic to a neighborhood in $R^{m_1+m_2-r_1-r_2}$, $S_2(x,y)$ is homeomorphic to a neighborhood in $R^{m_2-r_2}$, and $S_2(x,y) \subset S_1(x,y)$, and therefore $S_1(x,y) = S_2(x,y)$ if and only if $m_1 = r_1$.

Clearly if $S_1(x,y) = S_2(x,y)$ the implication that $x_1 = x_2$ in the theorem statement holds, and otherwise the implication is false since there will always be $x_1 \ne x_2$ such that $f(x_1,y_1) = f(x_2,y_2)$ no matter how small a neighborhood of $(\hat x,\hat y)$ is taken.

Proof of Theorem 3.8 

Define $F(T,\alpha,\beta) = (TA(\alpha,\beta)T^{-1},\ TB(\alpha,\beta),\ C(\alpha,\beta)T^{-1},\ D(\alpha,\beta))$; then the result holds if and only if there exist neighborhoods U of $\hat\alpha$ and V of $\hat\beta$ such that

$$F(T,\alpha,\beta) = F(\tilde T,\tilde\alpha,\tilde\beta)$$

and $\alpha, \tilde\alpha \in U$ and $\beta, \tilde\beta \in V$ and $T, \tilde T \in GL(n)$ imply $\alpha = \tilde\alpha$.

Now by an exactly analogous argument to that of Theorem 3.4 we can restrict T to be in a neighborhood of T = I. Therefore Lemma 3.9 applies and the result follows immediately.

An application of this result is given in Corollary 3.10. 
Example 3.2 (continued) 

n = m = p = 1 and a, b, c are free parameters. Now referring to Theorem 3.8, let $\alpha = a$ and $\beta = (b,c)$; then conditions 1) and 2) become

1) rank $\begin{bmatrix} 0 & 0 & 0 \\ b & 1 & 0 \\ -c & 0 & 1 \end{bmatrix} = 2 = r_2$ for all $(a,b,c) \in R^3$

2) rank $\begin{bmatrix} 0 & 0 & 0 & 1 \\ b & 1 & 0 & 0 \\ -c & 0 & 1 & 0 \end{bmatrix} = 3 = r_1 + r_2$ for all $(a,b,c) \in R^3$

Thus $r_1 = 1 = q_1$ = dimension of $\alpha$, and local partial identifiability of $\alpha$ results if $(\hat a, \hat b, \hat c)$ is minimal.
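The two rank conditions of this scalar example are easy to evaluate numerically. The following is a minimal sketch, assuming the column ordering $[Z,\ M_\beta,\ M_\alpha]$ used in the example; the function name `partial_ranks` is hypothetical.

```python
import numpy as np

def partial_ranks(b, c):
    """Return (r2, r1) for the scalar system with alpha = a, beta = (b, c)."""
    Z = np.array([[0.0], [b], [-c]])                          # a - a, b, -c
    M_beta = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # d/d(b, c)
    M_alpha = np.array([[1.0], [0.0], [0.0]])                 # d/da
    r2 = int(np.linalg.matrix_rank(np.hstack([Z, M_beta])))
    r12 = int(np.linalg.matrix_rank(np.hstack([Z, M_beta, M_alpha])))
    return r2, r12 - r2

print(partial_ranks(0.5, 2.0))
```

Since $r_1$ equals the dimension of $\alpha$ for every $(b,c)$, the pole $a$ is locally partially identifiable wherever the system is minimal, even though $b$ and $c$ are only determined through their product.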
3.5 Identifiability in the Presence of Feedback

As pointed out by Åström and Eykhoff (1971) identification in the presence of feedback can cause significant problems. Consider Åström's example given in Figure 3.2.


Figure 3.2


In this example a simple-minded identification algorithm would be to ignore $H_F$, observe e and y and assume that

$$H_P = \frac{y}{e}$$

but $\dfrac{y}{e} = (H_F)^{-1}$ and therefore such estimates would be completely false.

The feedback can enter quite subtly as for example an aircraft pilot's 
response to external disturbances. The correct way to model a system 
with feedback is to write down the state space equations in open-loop 
form with the feedback matrix modifying the system matrices. Then the 
identifiability questions can be asked and answered as in the previous 
sections of this chapter. 
For the general problem when the system and feedback matrices
are all parametrized by some unknown parameters, and it is desired to 
identify some of the parameters and not others then the conditions of 
Theorem 3.8 would have to be used. 

Two particular situations have been worked out by way of 
example in the following Corollary (a direct consequence of Theorems 
3.4 and 3.8) 

Corollary 3.10 

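Consider the linear feedback system,

$$\frac{dx(t)}{dt} = A(\alpha)x(t) + B(\alpha)u(t)$$

$$y(t) = C(\alpha)x(t), \qquad u(t) = -Fx(t) + v(t)$$

with $F \in R^{m \times n}$, where $(A,B,C)(\alpha): R^q \to R^{n(n+m+p)}$ is a $C^1$ parametrization of (A,B,C). Assume

1) rank $\left[\, W(\alpha,F)\ ;\ \begin{matrix} B(\alpha) \otimes I \\ 0 \end{matrix} \,\right] = r_2$ in a neighborhood of $\hat\alpha$, and

2) rank $\left[\, W(\alpha,F)\ ;\ \begin{matrix} B(\alpha) \otimes I \\ 0 \end{matrix}\ ;\ M(\alpha) \,\right] = r_1 + r_2$ in a neighborhood of $\hat\alpha$,

where

$$W(\alpha,F) = \begin{bmatrix} I \otimes (A(\alpha) - B(\alpha)F)' - (A(\alpha) - B(\alpha)F) \otimes I \\ I \otimes B'(\alpha) \\ -C(\alpha) \otimes I \end{bmatrix}$$

Then the parameters $\alpha$ and F are locally identifiable if and only if $r_1 + r_2 = nm + q$.

Further the parameters $\alpha$ are locally partially identifiable if and only if $r_1 = q$.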
When considering systems under feedback the analysis of invariants has been an active area of research (see Popov (1972), Morse (1972), Wang and Davison (1972), Wolovich and Falb (1969)). Therefore if one had a completely unknown system and wanted to identify as much as possible in the presence of an unknown state feedback matrix it would be natural to use a canonical form under the transformation $(A,B,C) \to (T(A - BF)T^{-1},\ TB,\ CT^{-1})$ with $F \in R^{m \times n}$ and $T \in GL(n)$. However such an analysis seems more suited to the design rather than the identification problem, since it is unlikely that only the invariants under feedback are required to be identified.





3.6 Comparison with the Information Matrix and Sensitivity Analysis

Previous work on identifiability has been of two types. Firstly there has been work on deterministic single input systems with the (A,B) matrices in standard controllable form and the question answered is what inputs will enable the unknown parameters in the (A,C) matrices to be identified (see Stanley and Yue (1970), Fisher (1965), Lee (1964)).

Secondly there has been work in the statistics and stochastic 
control literature on the type of observation noise statistics, control 
inputs and system parametrizations that enable the system matrices to 
be identified asymptotically. This work is generally based on the so- 
called (Fisher) Information Matrix, and needs knowledge such as the 
conditional probability density of the present observations given all 
previous observations and the parameters (see Rothenburg (1971), Tse (1973), Mehra (1972), Åström and Bohlin (1966)). The information matrix is a quite general approach, and indeed also gives approximations of the covariance of the parameter estimates; however, for the problem that has been considered in this chapter it gives computationally difficult tests.

The present work is complementary to the above work in that it 
assumes the inputs and observations are sufficient to identify the trans- 
fer function and then determines the identifiability of the system para- 
metrization. The equivalence of the two approaches will now be shown 
for a particular situation. 

Example 3.8 

Consider the linear discrete time dynamical system

$$x(k+1) = Ax(k) + Bu(k), \qquad x(0) = 0$$
$$z(k) = Cx(k) + w(k)$$

where w(k) is a Gaussian white noise sequence with $E(w(k)) = 0$ and $E(w(k)w'(j)) = R\,\delta_{kj}$ (with $R = R' > 0$).

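The evolution equation for s sample points (in time) can be written as

$$\begin{bmatrix} z(1) \\ z(2) \\ \vdots \\ z(s) \end{bmatrix} = \begin{bmatrix} H_0 & 0 & \cdots & 0 \\ H_1 & H_0 & & \vdots \\ \vdots & & \ddots & 0 \\ H_{s-1} & \cdots & H_1 & H_0 \end{bmatrix} \begin{bmatrix} u(0) \\ u(1) \\ \vdots \\ u(s-1) \end{bmatrix} + \begin{bmatrix} w(1) \\ w(2) \\ \vdots \\ w(s) \end{bmatrix}$$

where $H_k = CA^kB$.

Now if (A,B,C) are parametrized as $(A,B,C)(\alpha)$ the equation can be written as

$$z_s = h_s(\alpha) + w_s$$

with the obvious interpretation of the symbols.

For such a system Schweppe (1973) shows that the information matrix at $\alpha = \hat\alpha$ is,

$$M_s(\hat\alpha) = \left[\frac{\partial h_s(\hat\alpha)}{\partial \alpha}\right]' R_s^{-1} \left[\frac{\partial h_s(\hat\alpha)}{\partial \alpha}\right]$$

where $R_s = I_s \otimes R$.

Now let

$$h_s(\alpha) = \begin{bmatrix} h_0(\alpha) \\ h_1(\alpha) \\ \vdots \\ h_{s-1}(\alpha) \end{bmatrix}$$

where $h_k(\alpha) = \sum_{j=0}^{k} H_{k-j}(\alpha)\,u(j)$ and $H_k(\alpha) = C(\alpha)A^k(\alpha)B(\alpha)$.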

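Then

$$\frac{\partial h_k(\alpha)}{\partial \alpha_i} = \sum_{j=0}^{k} \frac{\partial H_{k-j}(\alpha)}{\partial \alpha_i}\,u(j)$$

and

$$\frac{\partial h_k(\alpha)}{\partial \alpha} = \sum_{j=0}^{k} U_j\, \frac{\partial \hat H_{k-j}(\alpha)}{\partial \alpha}$$

(where $\hat H_k$ is $H_k$ listed as a vector by rows). Hence

$$\frac{\partial h(\alpha)}{\partial \alpha} = \begin{bmatrix} U_0 & 0 & \cdots & 0 \\ U_1 & U_0 & & \vdots \\ \vdots & & \ddots & 0 \\ U_{s-1} & \cdots & U_1 & U_0 \end{bmatrix} \begin{bmatrix} \partial \hat H_0(\alpha)/\partial \alpha \\ \partial \hat H_1(\alpha)/\partial \alpha \\ \vdots \\ \partial \hat H_{s-1}(\alpha)/\partial \alpha \end{bmatrix} = U\,K(\alpha) \text{ say,}$$

where $U_k = I_p \otimes u'(k)$, and hence

$$M_s(\alpha) = K'(\alpha)\,U'\,R_s^{-1}\,U\,K(\alpha)$$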
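The factorization of the output sensitivity into an input-dependent block Toeplitz matrix $U$ and a parametrization-dependent matrix $K(\alpha)$ can be sketched as follows. This is an illustration under stated assumptions: the helper names are hypothetical, Markov parameters are listed as vectors by rows (NumPy's row-major `ravel()`), and the test system and input sequence are random. It checks that $U$ applied to the stacked Markov parameters reproduces the noise-free outputs $h_k = \sum_j H_{k-j}u(j)$.

```python
import numpy as np

def markov_params(A, B, C, s):
    """Return [H_0, ..., H_{s-1}] with H_k = C A^k B."""
    H, Ak = [], np.eye(A.shape[0])
    for _ in range(s):
        H.append(C @ Ak @ B)
        Ak = A @ Ak
    return H

def input_toeplitz(u, p):
    """Block lower-triangular Toeplitz matrix with blocks U_k = I_p (x) u'(k)."""
    s, m = u.shape
    U = np.zeros((s * p, s * p * m))
    for k in range(s):
        for i in range(k + 1):
            block = np.kron(np.eye(p), u[k - i][None, :])
            U[k * p:(k + 1) * p, i * p * m:(i + 1) * p * m] = block
    return U

def information_matrix(K, U, R):
    """M_s = K' U' R_s^{-1} U K with R_s = I_s (x) R."""
    s = U.shape[0] // R.shape[0]
    Rs_inv = np.kron(np.eye(s), np.linalg.inv(R))
    return K.T @ U.T @ Rs_inv @ U @ K

rng = np.random.default_rng(0)
n, m, p, s = 2, 2, 1, 4
A = 0.5 * rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
u = rng.standard_normal((s, m))

H = markov_params(A, B, C, s)
H_vec = np.concatenate([Hk.ravel() for Hk in H])
h_direct = np.concatenate(
    [sum(H[k - j] @ u[j] for j in range(k + 1)) for k in range(s)]
)
h_toeplitz = input_toeplitz(u, p) @ H_vec
print(np.allclose(h_direct, h_toeplitz))
```

Because the map from Markov parameters to outputs is linear, the same matrix $U$ multiplies the Jacobian $K(\alpha)$, which is why the rank of $K$ controls whether any input can make the information matrix nonsingular.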


Now $M_s^{-1}(\hat\alpha)$ gives a lower bound on the covariance of any unbiased estimator of $\alpha$ (the Cramer-Rao lower bound). Therefore to asymptotically identify $\alpha$ exactly with an unbiased estimator we need that all the eigenvalues of $M_s(\hat\alpha)$ tend to infinity as $s \to \infty$. This is a condition on both the parametrization and the input sequence. It is clearly necessary that $K(\hat\alpha)$ must be of full rank for the information to tend to infinity as $s \to \infty$, in which case inputs will exist to ensure this is so. (This is similar to the persistent excitation required by Åström and Bohlin (1966).) The condition that the rank $K(\hat\alpha) = q$ is an identical sufficient condition for local identifiability as that given in Theorem 3.3, however this condition is only necessary if rank $K(\alpha)$ is constant in a neighborhood of $\hat\alpha$. Indeed it is not necessary for identifiability that the Cramer-Rao lower bound tend to zero as $s \to \infty$, because biased estimators can sometimes improve on the Cramer-Rao lower bound. For example the maximum likelihood estimator (see Box and Jenkins (1970)) is in general biased for any fixed sample length, s, but as $s \to \infty$ the estimates tend to the true values. For maximum likelihood estimators the Cramer-Rao lower bound will only be tight when $M_s(\hat\alpha) > 0$ and when $s \to \infty$. If $M(\alpha)$ is singular for $\alpha = \hat\alpha$ but not in a neighborhood of $\hat\alpha$, then the Cramer-Rao lower bound does not give meaningful results, since in this case no linearization is valid near $\hat\alpha$.

In a very similar manner to the above analysis for the noise 
free case, the sensitivity of the outputs with respect to the parameters 
given the inputs, can be produced. (see for example Kokotovic and 
Rutman (1965) ) . Identifiability will then result if the sensitivity 
of the outputs is of full rank. Such a result would depend on the 
inputs, but given that the input sequence is satisfactory, a condition 
equivalent to that of Theorem 3.3 will be obtained for the identifiability 
of a parametrization. One can also consider the sensitivity of the trans- 
fer function or Markov parameters with respect to the unknown system 
parameters, and then this would be equivalent to Theorem 3.3. 



CHAPTER 4 


IDENTIFIABILITY FROM OUTPUT CORRELATION

4.1 Introduction 

In this chapter we consider the identifiability of linear system 
parametrizations when the system is driven by white noise and only the 
output is observed. This is in general a significantly more difficult 
problem than when input observations are also made, and is referred to 
variously as the spectral factorization problem and the inverse problem 
of covariance generation. Before the identifiability problem can be 
approached characterizations of indistinguishable systems in these 
situations are required. Sections 4.2 and 4.3 give the appropriate 
background material for the continuous time and discrete time situations 
respectively. Then in Section 4.4 the identifiability problem is 
considered. 

4.2 Continuous Time Systems 

In this section we consider the system,

$$\frac{dx(t)}{dt} = A\,x(t) + B\,u(t)$$
$$y(t) = C\,x(t) + D\,u(t)$$

with $x(\cdot) \in R^n$, $u(\cdot) \in R^m$, $y(\cdot) \in R^p$ and the following assumptions.

A1. The input u(t) is not observed directly but is assumed to be a white noise process normalized such that $E(u(t)u'(\tau)) = I\,\delta(t-\tau)$.

A2. The matrix A is asymptotically stable (i.e. the eigenvalues of A are strictly in the left half plane).

A3. The system has reached steady state when the observations begin (i.e. the output process y(t) is a stationary random process).

A4. The system to be identified is globally minimal, i.e. the dimension of the state is less than or equal to that of any other system with the same output spectral density when driven by white noise (Anderson (1969)).

Under these assumptions the most information that may be obtained from output observations is the output spectral density,

$$\Phi(s) = G(s)G'(-s), \quad \text{where } G(s) = C(Is - A)^{-1}B + D.$$

The identification problem is thus, given observations of $\Phi(s)$, find a system G(s) such that $\Phi(s) = G(s)G'(-s)$. This is the so-called spectral factorization problem. It has been extensively studied, and a general solution in the frequency domain has been given by Youla (1961). A general time domain treatment of this problem has been given by Anderson (1969). Since we are primarily concerned with state space representations the results of Anderson are most useful for our purposes, and are restated here for easy reference.

Let Z(s) be a positive real matrix of rational functions such that

$$\Phi(s) = Z(s) + Z'(-s) \quad \text{(sum decomposition)}$$

Z(s) is in fact the Laplace transform of the correlation function $R_{yy}(\tau)$ for $\tau \ge 0$.

Now let (A,G,C,J) be a minimal realization of Z(s). Then we have the following result.
Lemma 4.1 (Anderson (1969))

Consider the matrix equation

$$(ME)\qquad \begin{bmatrix} AP + PA' & PC' - G \\ CP - G' & -J - J' \end{bmatrix} = -\begin{bmatrix} B \\ D \end{bmatrix} [\,B'\ \ D'\,]$$

in the unknown matrices P (n x n), B (n x m), and D (p x m). Then every globally minimal solution G(s) to $\Phi(s) = G(s)G'(-s)$ has a state space realization (A,B,C,D) with B and D satisfying (ME) together with some $P = P' > 0$.

Conversely if B, D and $P = P' > 0$ satisfy (ME), $G(s) = C(Is - A)^{-1}B + D$ is a globally minimal solution to $\Phi(s) = G(s)G'(-s)$.
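The relations packed into (ME) can be illustrated on a small system. The sketch below assumes (ME) is read block by block as $AP + PA' = -BB'$, $G = PC' + BD'$ and $J + J' = DD'$; the numerical values and helper names are hypothetical, and the Lyapunov equation is solved with a plain Kronecker-product linear system rather than a dedicated solver. It checks $Z(s) + Z'(-s) = G(s)G'(-s)$ at one test frequency.

```python
import numpy as np

def lyapunov_P(A, B):
    """Solve A P + P A' = -B B' via Kronecker products (row-major vec)."""
    n = A.shape[0]
    I = np.eye(n)
    L = np.kron(A, I) + np.kron(I, A)
    P = np.linalg.solve(L, -(B @ B.T).ravel()).reshape(n, n)
    return 0.5 * (P + P.T)   # remove round-off asymmetry

def tf(A, B, C, D, s):
    """Evaluate C (sI - A)^{-1} B + D at the complex frequency s."""
    n = A.shape[0]
    return C @ np.linalg.solve(s * np.eye(n) - A, B) + D

# Hypothetical stable, controllable test system.
A = np.array([[-1.0, 0.5], [0.0, -2.0]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[1.0]])

P = lyapunov_P(A, B)
Gm = P @ C.T + B @ D.T      # the matrix called G in (ME)
J = 0.5 * (D @ D.T)         # any J with J + J' = D D'

s = 0.3 + 0.7j
spectrum = tf(A, B, C, D, s) @ tf(A, B, C, D, -s).T   # G(s) G'(-s)
sum_form = tf(A, Gm, C, J, s) + tf(A, Gm, C, J, -s).T  # Z(s) + Z'(-s)
print(np.allclose(spectrum, sum_form))
```

Here the prime denotes a plain transpose evaluated at $-s$, so the code uses `.T` without conjugation.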

Lemma 4.1 essentially characterizes all equivalent state space 
solutions to the spectral factorization problem, and is used in the 
following corollary. 


If $(A_1,B_1,C_1,D_1)$ and $(A_2,B_2,C_2,D_2)$ are minimal systems then

$$G_1(s)G_1'(-s) = G_2(s)G_2'(-s)$$

(where $G_i(s) = C_i(Is - A_i)^{-1}B_i + D_i$, i = 1,2)

if and only if there exists $T \in GL(n)$ and $Q = Q'$ such that

$$\begin{aligned} A_1 &= T A_2 T^{-1} \\ C_1 &= C_2 T^{-1} \\ Q A_1' + A_1 Q &= -B_1 B_1' + T B_2 B_2' T' \\ Q C_1' &= -B_1 D_1' + T B_2 D_2' \\ D_1 D_1' &= D_2 D_2' \end{aligned}$$

Further if $D_1 D_1'$ is nonsingular the above is equivalent to there being a similarity transformation between the Kalman filters of the two systems.

Proof 

We know from global minimality and Lemma 4.1 that if

$$\Phi(s) = Z(s) + Z'(-s)$$

with Z(s) positive real, then a minimal realization of Z(s) is given by $(A_1, G, C_1, J)$ where

$$G = P_1 C_1' + B_1 D_1'$$

and $J + J' = D_1 D_1'$, and where $P_1 = P_1' > 0$ satisfies

$$P_1 A_1' + A_1 P_1 = -B_1 B_1'.$$

Also Lemma 4.1 implies that there exists a unique similarity transformation $T \in GL(n)$ between $(A_2,C_2)$ and $(A_1,C_1)$, i.e.

$$A_1 = T A_2 T^{-1}, \qquad C_1 = C_2 T^{-1}.$$

Therefore $(A_2, T^{-1}G, C_2, J)$ is also a realization of Z(s) and there exists $P_2 = P_2' > 0$ such that

$$\begin{aligned} P_2 A_2' + A_2 P_2 &= -B_2 B_2' \\ T^{-1} G &= P_2 C_2' + B_2 D_2' \\ J + J' &= D_2 D_2' \end{aligned}$$

Simple manipulation of the above equations gives that

$$(P_1 - T P_2 T') A_1' + A_1 (P_1 - T P_2 T') = -B_1 B_1' + T B_2 B_2' T'$$

$$G = T P_2 C_2' + T B_2 D_2' = P_1 C_1' + B_1 D_1'$$

Hence, setting $Q = P_1 - T P_2 T'$, the 'only if' statement follows.

The 'if' statement can be verified by direct substitution.

The equivalence of the condition with the equivalence of the 
system Kalman filters is proved as follows. (That two systems with 
equivalent Kalman filter's are indistinguishable is essentially shown 
in Geesay and Kailath (1969)). 

The Kalman filter is realized by 

- A ;<t> - K vrt) 

y(t) = c x (t) + v (t) 
where v(t) = C x(t) - y(t) 

K = (IIO + BD* ) (DD 1 ) “ 1 

and 

Ha' + aH + BB' - (nc + BD* ) (DD* ) ”" 1 (lie' + BD’)’ =0 

Now defining I = P - IT, and using Lemma 4.1 we get 

(*) ZA' + hi - (G - EC')(J + J'J^G - EC')' =0 

for which there is a unique minimal solution for 

Thus two Kalman filters are equivalent if there exists
T ∈ GL(n) such that

    A_1 = T A_2 T^{-1},    C_1 = C_2 T^{-1},    K_1 = T K_2

and the covariance of ν_1 = covariance of ν_2, i.e. D_1 D_1' = D_2 D_2'.
Hence the only equation to prove is that K_1 = T K_2.

Consider (*) for system (1), then substituting for A_1 and C_1
and setting Σ = T^{-1} Σ_1 T^{-1}' gives

    Σ A_2' + A_2 Σ - (T^{-1}G - Σ C_2')(J + J')^{-1}(T^{-1}G - Σ C_2')' = 0

but (*) for system (2) gives

    Σ_2 A_2' + A_2 Σ_2 - (T^{-1}G - Σ_2 C_2')(J + J')^{-1}(T^{-1}G - Σ_2 C_2')' = 0

Hence since there is a unique minimal solution

    Σ_2 = Σ = T^{-1} Σ_1 T^{-1}'


Thus

    K_1 = (Π_1 C_1' + B_1 D_1')(D_1 D_1')^{-1}

        = (P_1 C_1' + B_1 D_1' - Σ_1 C_1')(D_1 D_1')^{-1}

        = (G - T Σ_2 C_2')(D_2 D_2')^{-1}

        = T (T^{-1}G - Σ_2 C_2')(D_2 D_2')^{-1}

        = T (P_2 C_2' + B_2 D_2' - Σ_2 C_2')(D_2 D_2')^{-1}

        = T K_2

as desired.

Conversely that the equivalence of the Kalman filters implies
the existence of Q = Q' can be established analogously. ∎

This relationship between the solutions to the spectral
factorization problem will be used in Section 4.4 where we discuss
the identifiability problem stated earlier.
4.3 Discrete Time Systems

In this section analogous results to those of Section 4.2 are
derived for the discrete-time case. The results presented are probably
equivalent to other work on discrete-time spectral factorization (e.g.
Mehra (1970 and 1971) and Motyka and Cadzow (1967)) but the particular
form of the results does not seem to have appeared in the literature.

Consider the discrete-time linear dynamical system: 


x(k+l) = A x(k) + B w(k) 
y(k) = C x (k) + D w(k) 


where w(·) is white Gaussian noise with E{w(k)w'(j)} = I δ_kj. Now
assume that the output spectral density, Φ(z) (= the z-transform of
E{y(k)y'(k-i)}), is known; then the (discrete-time) spectral factorization
problem is to find an asymptotically stable (i.e. |λ_i(A)| < 1) transfer
function G(z) such that

    Φ(z) = G(z)G'(z^{-1})

Any state space realization of G(z) will then give possible values 
for the parameters (A,B,C,D) . 

Since Φ(z) is a spectral density matrix we can assume without
loss of generality that:

A1. Φ(z) = Φ'(z^{-1})

A2. Φ(e^{jθ}) is Hermitian nonnegative definite for -π < θ < π.

Further we will assume that,

A3. Φ(z) is analytic for z = e^{jθ} with -π < θ < π, i.e. Φ(z) has no poles
    on the unit circle.
From the partial fraction expansion of Φ(z) one can decompose
Φ(z) as

    Φ(z) = Z(z) + Z'(z^{-1})

where the poles of Z(z) are strictly inside the unit disc. Z(z) is
essentially the one-sided z-transform of E{y(k)y'(k-i)}. We are
thus looking for a factorization of the form:

(SF)    Φ(z) = Z(z) + Z'(z^{-1}) = G(z)G'(z^{-1})

Assume that Z(z) has a minimal realization (A,G,C,J), then the
following Lemma characterizes all solutions to (SF).


Lemma 4.3

Consider the matrix equation

(ME)'    [ APA' - P       APC' - G        ]       [ B ]
         [                                ]  = -  [   ] [ B'  D' ]
         [ CPA' - G'      -J - J' + CPC'  ]       [ D ]

in the unknown matrices P(n×n), B(n×m), and D(p×m). Then every
globally minimal solution, G(z), to Φ(z) = G(z)G'(z^{-1}) has a state
space realization (A,B,C,D) with B and D satisfying (ME)' together with
some P = P' > 0.

Conversely if B, D and P = P' > 0 satisfy (ME)',
G(z) = C(Iz - A)^{-1}B + D is a globally minimal solution to
Φ(z) = G(z)G'(z^{-1}).


Proof 

We will make the transformation s = (z-1)/(z+1) and reduce the
problem to the continuous time case. First we note an observation
about this transformation which is easily verified by direct substitution.
Fact 4.4

If (A,B,C,D) is a minimal realization of G(z) (with |λ_i(A)| < 1)
and W(s) = G((1+s)/(1-s)) then the McMillan degrees of W(s) and G(z) are
equal and further

    (-(I-A)(I+A)^{-1}, √2 (I+A)^{-1}B, √2 C(I+A)^{-1}, D - C(I+A)^{-1}B)

is a minimal realization of W(s).

Conversely if (F,G,H,J) is a realization of W(s) and
G(z) = W((z-1)/(z+1)) then

    ((I-F)^{-1}(I+F), √2 (I-F)^{-1}G, √2 H(I-F)^{-1}, J + H(I-F)^{-1}G)

is a minimal realization of G(z).
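
The substitution in Fact 4.4 can be checked numerically. The following is a minimal sketch for the scalar case n = m = p = 1 only, so that every matrix inverse becomes a division; the function names and the numerical values are illustrative assumptions and do not come from the text.

```python
import math


def discrete_to_continuous(a, b, c, d):
    """Scalar case of Fact 4.4: realization (f, g, h, j) of W(s) = G((1+s)/(1-s))."""
    root2 = math.sqrt(2.0)
    f = -(1.0 - a) / (1.0 + a)
    g = root2 * b / (1.0 + a)
    h = root2 * c / (1.0 + a)
    j = d - c * b / (1.0 + a)
    return f, g, h, j


def continuous_to_discrete(f, g, h, j):
    """Converse direction of Fact 4.4 for n = 1."""
    root2 = math.sqrt(2.0)
    a = (1.0 + f) / (1.0 - f)
    b = root2 * g / (1.0 - f)
    c = root2 * h / (1.0 - f)
    d = j + h * g / (1.0 - f)
    return a, b, c, d


def transfer(state, inp, out, feed, point):
    """Evaluate out * (point - state)^{-1} * inp + feed for scalar systems."""
    return out * inp / (point - state) + feed


discrete = (0.5, 2.0, 3.0, 1.0)       # hypothetical (A, B, C, D) with |A| < 1
continuous = discrete_to_continuous(*discrete)

s = 0.3 + 0.7j                        # arbitrary test point
z = (1 + s) / (1 - s)
gap = abs(transfer(*discrete, z) - transfer(*continuous, s))
round_trip = continuous_to_discrete(*continuous)
```

Here `gap` compares G(z) with W(s) at matching points and `round_trip` applies the converse formulas to the transformed realization; under the stated assumptions both should reproduce the original system up to rounding.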

Now define

    W(s) = G((1+s)/(1-s)),    T(s) = Z((1+s)/(1-s)),    Ψ(s) = Φ((1+s)/(1-s))

then

    Ψ(s) = T(s) + T'(-s) = W(s)W'(-s)

and T(s) is positive real.

Since G(z) is globally minimal as a solution to G(z)G'(z^{-1}) = Φ(z),
W(s) is globally minimal as a solution to W(s)W'(-s) = Ψ(s). Using
Fact 4.4 T(s) has a minimal realization

    (-(I-A)(I+A)^{-1}, √2 (I+A)^{-1}G, √2 C(I+A)^{-1}, J - C(I+A)^{-1}G)

and using Lemma 4.1 W(s) will have a realization of the form

    (-(I-A)(I+A)^{-1}, √2 (I+A)^{-1}B, √2 C(I+A)^{-1}, D - C(I+A)^{-1}B)

where B and D satisfy the following equations for some P = P' > 0.

    -(I-A)(I+A)^{-1} P - P (I+A')^{-1}(I-A') = - 2 (I+A)^{-1} B B' (I+A')^{-1}

    √2 P (I+A')^{-1} C' - √2 (I+A)^{-1} G = - √2 (I+A)^{-1} B (D' - B'(I+A')^{-1} C')

    -J + C(I+A)^{-1}G - J' + G'(I+A')^{-1}C' = - (D - C(I+A)^{-1}B)(D - C(I+A)^{-1}B)'

Now since G(z) = W((z-1)/(z+1)) and using Fact 4.4, G(z) will have
a realization (A,B,C,D) satisfying the above equations. Straightforward
manipulation then gives the result.

The converse is easily established by direct substitution. ∎

Lemma 4.3 shows that if the sum decomposition can be identified then 
the spectral factor satisfies a relatively simple matrix equation. 
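
As an illustration of Lemma 4.3, the sketch below treats the scalar case (n = m = p = 1), where the three blocks of (ME)' can be solved in closed form for P, G and J, and then compares the sum decomposition with the product G(z)G'(z^{-1}) on the unit circle. The numerical values and function names are assumptions chosen for the example.

```python
import cmath


def sum_decomposition(a, b, c, d):
    """Solve the scalar version of (ME)' for P, G, J, assuming |a| < 1."""
    p = b * b / (1.0 - a * a)             # from a*p*a - p = -b*b
    g = a * p * c + b * d                 # from a*p*c - g = -b*d
    j_term = 0.5 * (c * p * c + d * d)    # from -2*J + c*p*c = -d*d
    return p, g, j_term


def spectral_factor(a, b, c, d, z):
    """G(z) = c (z - a)^{-1} b + d for a scalar system."""
    return c * b / (z - a) + d


a, b, c, d = 0.6, 1.0, 2.0, 0.5           # hypothetical stable system
p, g, j_term = sum_decomposition(a, b, c, d)


def z_part(z):
    """Z(z) = c (z - a)^{-1} g + J, the causal part of the spectral density."""
    return c * g / (z - a) + j_term


max_gap = 0.0
for theta in (0.1, 1.0, 2.5):
    z = cmath.exp(1j * theta)
    phi_from_sum = z_part(z) + z_part(1 / z)
    phi_from_factor = spectral_factor(a, b, c, d, z) * spectral_factor(a, b, c, d, 1 / z)
    max_gap = max(max_gap, abs(phi_from_sum - phi_from_factor))
```

With these values P is positive and the two expressions for Φ(z) agree to rounding error at each sampled frequency, as the lemma predicts for this special case.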

The following corollary uses Lemma 4.3 to derive a relationship 
between the solutions to the spectral factorization problem. 

Corollary 4.5

If (A_1,B_1,C_1,D_1) and (A_2,B_2,C_2,D_2) are globally minimal discrete
time systems then

    G_1(z)G_1'(z^{-1}) = G_2(z)G_2'(z^{-1})

(where G_i(z) = C_i(Iz - A_i)^{-1}B_i + D_i, i = 1,2) if and only if there
exist T ∈ GL(n) and Q = Q' such that

    A_1 = T A_2 T^{-1},    C_1 = C_2 T^{-1}

    A_1 Q A_1' - Q = - B_1 B_1' + T B_2 B_2' T'

    A_1 Q C_1' = T B_2 D_2' - B_1 D_1'

    C_1 Q C_1' = D_2 D_2' - D_1 D_1'

Further if D_1 D_1' is nonsingular the above conditions are equivalent to
there being a similarity transformation between the Kalman filters of
the two systems.

Proof

The proof is analogous to that of Corollary 4.2. The equivalence
of the Kalman filters is also shown by Tse and Weinert (1973) by
different techniques. ∎


Comments on the Correlation Identification Technique due to Mehra (1971) . 


To illustrate how the previous results can be applied, the 
correlation technique of Mehra (1971) is now considered by way of 
example . 


This algorithm estimates the system parameters from estimates
of the output correlation function,

    C_i = E{y(k)y'(k-i)} ≈ Ĉ_i = (1/(N-i)) Σ_k y(k)y'(k-i)

Now

    Φ(z) = Σ_{i=1}^{∞} C_i z^{-i} + Σ_{i=1}^{∞} C_i' z^{i} + C_0

         = Z(z) + Z'(z^{-1})        (see Lemma 4.3)

         = C(Iz-A)^{-1}G + J + G'(Iz^{-1} - A')^{-1}C' + J'

         = Σ_{i=1}^{∞} C A^{i-1} G z^{-i} + Σ_{i=1}^{∞} G'A'^{i-1}C' z^{i} + J + J'

(A,G,C,J) can be a realization of the sum decomposition, Z(z), of Φ(z)
by Lemma 4.3. Further the Ĉ_i will now be estimates of the Markov
parameters of Z(z). Hence the matrices A and C can be estimated using
a standard realization algorithm (e.g. Ho and Kalman (1967)). Finally,
the B and D matrices will be the solution to the algebraic equations
given in Lemma 4.3.

In order to obtain a unique solution for (A,C), (A,C) could
be put in a canonical form such as those of Chapter 2, but still B
and D will not be uniquely determined even if a minimum phase assumption
is made (unless m = p = 1). This problem is pointed out by Mehra (1971)
where he suggests identifying the Kalman filter instead, whose transfer
function will indeed be identifiable by Corollary 4.5. The identifiability
problem will now be discussed in more detail in the following
section.

4.4 Identifiability from Output Observation 

The question considered in this section is when a parametrization 
is identifiable from output observation alone. 

Definition 4.2

Let (A,B,C,D)(α) : Ω ⊂ R^q → R^{n(n+m+p)+mp} be a parametrization
of the system matrices (A,B,C,D). This parametrization is said to be
locally identifiable from its output spectral density at α = α̂ ∈ Ω
if there exists an ε > 0 such that

(i)     ||α - α̂|| < ε,  ||β - α̂|| < ε,  α, β ∈ Ω

and (ii a) (continuous time)

    G(s,α)G'(-s,α) = G(s,β)G'(-s,β) for all s ∈ C

    (ii b) (discrete time)

    G(z,α)G'(z^{-1},α) = G(z,β)G'(z^{-1},β) for all z ∈ C

imply α = β
(where G(s,α) = C(α)(Is - A(α))^{-1}B(α) + D(α)). ∎

A condition for local identifiability in this sense can be obtained
via the characterizations of all globally minimal solutions to the
spectral factorization problem given in Corollaries 4.2 and 4.5. Thus
local identifiability from the output spectral density is implied if
the following equations have a unique solution α = β, T = I, Q = 0
for all α, β ∈ N_ε(α̂).

1. Continuous Time

    Q = Q',   A(α) = T A(β) T^{-1},   C(α) = C(β) T^{-1}

    A(α)Q + Q A'(α) = - B(α)B'(α) + T B(β)B'(β)T'

    Q C'(α) = - B(α)D'(α) + T B(β)D'(β)

    D(α)D'(α) = D(β)D'(β)

2. Discrete Time

    Q = Q',   A(α) = T A(β) T^{-1},   C(α) = C(β) T^{-1}

    A(α)Q A'(α) - Q = - B(α)B'(α) + T B(β)B'(β)T'

    A(α)Q C'(α) = - B(α)D'(α) + T B(β)D'(β)

    C(α)Q C'(α) = - D(α)D'(α) + D(β)D'(β)

The following theorem can be proved in an analogous manner to Theorem 3.4.

Theorem 4.6

Let (A,B,C,D)(α) : Ω ⊂ R^q → R^{n(n+m+p)+mp} (Ω an open set
in R^q) be a C^1 parametrization of the system matrices (A,B,C,D) of
a continuous time system satisfying (A1) - (A4) of Section 4.2. Then
this parametrization is locally identifiable from its output spectral
density at α̂ ∈ Ω, if the following linear equations in (δβ, δB, δD, δT, δQ)
have a unique solution (i.e. zero).

(i)     δQ = δQ'

(ii)    (A δQ + δB B' - δT B B') + (A δQ + δB B' - δT B B')' = 0

(iii)   δQ C' = - δB D' - B δD' + δT B D'

(iv)    δD D' + D δD' = 0

(v)     [ δT A - A δT     δB ]
        [                    ]  =  M(α̂) δβ
        [ - C δT          δD ]

where M(α) is defined in Theorem 3.3, and (A,B,C,D) = (A,B,C,D)(α̂).
The analogous equations for discrete time systems are

(i)'    δQ = δQ'

(ii)'   A δQ A' - δQ = - δB B' - B δB' + δT B B' + B B' δT'

(iii)'  A δQ C' = - δB D' - B δD' + δT B D'

(iv)'   C δQ C' = - δD D' - D δD'

(v)'    as (v) above.

The above condition is equivalent to a nonzero determinant
condition of dimension [(n/2)(3n + 2m + 1) + pm].

Notice that although the theorem uses implicitly the matrices
P, G, J of Lemmas 4.1 and 4.3 only the nominal values of the system
matrices (A,B,C,D) are required.

In general fewer parameters can be identified than when input
observations are allowed. In fact the number of identifiable parameters
is bounded by [2np + p(p+1)/2], which if m = p is p(p-1)/2 less than
the [2np + mp] identifiable parameters when input observations are
permitted.



CHAPTER 5


GEOMETRICAL PROPERTIES OF MINIMAL SYSTEMS †


5.1 Introduction

In this chapter we examine some geometrical properties of minimal 
linear systems which are of interest in identification and also in their 
own right. 

As mentioned in Chapters 2 and 3 many useful parametrizations 
admit multiple representations of nonminimal systems. This implies that 
in such cases if the system being identified with such a parametrization 
is not minimal then the identification problem no longer has a unique 
solution, and many minimization algorithms will become ill-posed. Now 
in on-line algorithms where new estimates of the unknown parameters are 
made after each new data point, a cost function is essentially minimized 
at each point in time and there is no reason to suppose that after re- 
latively few data points the estimates will represent minimal systems. 
Whether estimates become nonminimal or nearly nonminimal depends on the 
nature of the set of nonminimal systems in the parameter space. 

In this chapter the following problem is considered, "given
a parametrization of a linear system which may represent both minimal
and nonminimal systems does the set of nonminimal systems separate the
minimal systems into unconnected regions?" For single input/single
output systems the natural parametrization of the standard controllable
(or observable) form is considered and indeed the minimal systems do
not form a connected subset of the parameter space. However for
multi-input/multi-output systems there is no "natural" parametrization and
the problem is more complex, but in general it would seem that the
minimal systems form a connected subset of the parameter space, and this is
proven for certain examples.

† I would like to acknowledge that many of the original ideas for
this chapter are due to Professor R.W. Brockett of Harvard University.
5.2 Single Input/Single Output Systems

The following unpublished result of Brockett (private communication)
shows that for single input/single output systems the set of
minimal systems does not form a connected subset of the parameter space.
The proof that the space is disconnected is the same as that of Brockett,
but the proof that each region is connected is new.

Theorem 5.1 (Brockett)

Given the rational function

             β_{n-1} z^{n-1} + ... + β_0
    g(z) = ---------------------------------
           z^n + α_{n-1} z^{n-1} + ... + α_0

Then the parameter space, R^{2n}, is divided into (n+1) connected
regions in which there are no pole/zero cancellations. Each such region
is characterized by the Cauchy index of g(z) (or the signature of the
corresponding Hankel matrix) and these disconnected regions are separated
by rational functions of lower order.

Note: Cauchy index of g(z) = I_{-∞}^{∞}(g(z))
        = (number of times g(z) changes from -∞ to +∞)
          - (number of times g(z) changes from +∞ to -∞)
          as z goes from -∞ to ∞ on the real line.

If S = S' then the signature of S, σ(S) = (number of positive
eigenvalues of S) - (number of negative eigenvalues of S).
Proof

1) The regions are separated

Let (A(α), b, c'(β)) be the standard observable realization of
g(z) and let

    s_i(α,β) = c'(β) A^i(α) b

Define

          [ s_0       s_1    ...   s_{r-1}  ]
    S_r = [ s_1       s_2    ...   s_r      ]
          [  .                        .     ]
          [ s_{r-1}   s_r    ...   s_{2r-2} ]

There are no pole/zero cancellations if and only if

    det S_n(α,β) ≠ 0.

Now it is shown in Gantmacher (1959) that the Cauchy index of
g(z) = signature of S_n and that if two symmetric matrices have different
signatures then every continuous path in the space of symmetric matrices
that connects them passes through a singular matrix. Therefore it is
not possible to continuously connect two rational functions with different
Cauchy indices without passing through a pole/zero cancellation. (This
is clear since if the signature changes, an eigenvalue must change
sign; the matrix is symmetric, so its eigenvalues are real and vary
continuously along the path, and one of them must pass through zero, at
which point the matrix is singular.)
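
To make the region labels concrete, the sketch below computes the Markov parameters s_i of a rational function by coefficient matching, forms the leading minors D_r = det S_r, and reads the signature off the sign variations via Jacobi's rule σ(S_n) = n - 2V(1, D_1, ..., D_n), which is quoted later in the proof of Lemma 5.3. It assumes that all leading minors are nonzero; the helper names and the three second-order examples are illustrative choices, not taken from the text.

```python
from fractions import Fraction


def markov_parameters(beta, alpha, count):
    """s_0..s_{count-1} of (beta[n-1] z^{n-1}+...+beta[0]) / (z^n+alpha[n-1] z^{n-1}+...+alpha[0])."""
    n = len(alpha)
    s = []
    for k in range(count):
        value = Fraction(beta[n - 1 - k]) if k < n else Fraction(0)
        for g in range(1, min(k, n) + 1):
            value -= Fraction(alpha[n - g]) * s[k - g]
        s.append(value)
    return s


def det(m):
    """Exact determinant by cofactor expansion (adequate for small matrices)."""
    if not m:
        return Fraction(1)
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def leading_minors(s, n):
    """D_1, ..., D_n of the Hankel matrix built from s_0, ..., s_{2n-2}."""
    return [det([[s[i + j] for j in range(r)] for i in range(r)]) for r in range(1, n + 1)]


def signature(minors):
    """Jacobi's rule: n minus twice the sign variations in (1, D_1, ..., D_n)."""
    seq = [Fraction(1)] + list(minors)
    if any(x == 0 for x in seq):
        raise ValueError("Jacobi's rule needs nonzero leading minors")
    variations = sum(1 for x, y in zip(seq, seq[1:]) if (x > 0) != (y > 0))
    return len(minors) - 2 * variations


def cauchy_index(beta, alpha):
    n = len(alpha)
    return signature(leading_minors(markov_parameters(beta, alpha, 2 * n - 1), n))


alpha = [2, -3]                                   # denominator (z-1)(z-2)
index_plus = cauchy_index([-3, 2], alpha)         # 1/(z-1) + 1/(z-2)
index_zero = cauchy_index([3, 1], alpha)          # -4/(z-1) + 5/(z-2)
index_minus = cauchy_index([3, -2], alpha)        # -1/(z-1) - 1/(z-2)
```

For n = 2 the three examples land in the three regions with indices 2, 0 and -2, matching the count of (n+1) regions in Theorem 5.1.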

2) Each region is connected

To prove connectedness we must exhibit a continuous path connecting
any two rational functions with the same Cauchy index and degree.
Firstly we show that it is sufficient to find a path in the space of
s_i, i = 0,1,...,2n-1 (Lemma 5.2). Then it is shown that any Hankel
matrix with a particular signature can be continuously deformed into
a standard form without its determinant becoming zero (Lemma 5.3).
Lemma 5.2

For any continuous functions s_i(t) : [0,1] → R^1,
i = 0,...,2n-1, and such that det S_n(t) ≠ 0, there exist continuous
functions α_i(t) : [0,1] → R^1 and β_i(t) : [0,1] → R^1, i = 0,1,...,n-1,
such that for each t

    (β_{n-1}(t) z^{n-1} + ... + β_0(t)) / (z^n + α_{n-1}(t) z^{n-1} + ... + α_0(t))

        = s_0(t)/z + s_1(t)/z^2 + ... + s_{2n-1}(t)/z^{2n}

          + Σ_{i=0}^{∞} s_{2n+i}(t)/z^{2n+i+1}

where

    s_{2n+i}(t) = - Σ_{g=1}^{n} α_{n-g}(t) s_{2n+i-g}(t),    i = 0,1,...

and S_n(t) is defined in Theorem 5.1.

Proof (See Gantmacher Vol. II, page 207)

If det S_n(t) ≠ 0 then

    [ α_0(t)     ]                    [ s_n(t)      ]
    [ α_1(t)     ]                    [ s_{n+1}(t)  ]
    [   .        ]  =  - S_n^{-1}(t)  [    .        ]
    [ α_{n-1}(t) ]                    [ s_{2n-1}(t) ]

and

    [ β_{n-1}(t) ]     [ s_0(t)        0       ...     0     ] [ 1          ]
    [ β_{n-2}(t) ]     [ s_1(t)      s_0(t)    ...     0     ] [ α_{n-1}(t) ]
    [   .        ]  =  [   .                    .      .     ] [   .        ]
    [ β_0(t)     ]     [ s_{n-1}(t)   ...    s_1(t)  s_0(t)  ] [ α_1(t)     ]

and the result follows immediately. ∎
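
The two matrix equations in the proof of Lemma 5.2 amount to one linear solve followed by a triangular convolution. The sketch below carries this out exactly with rational arithmetic at a single value of t; the function names and the test sequence are assumptions made for the example, and the index convention is alpha[i] = α_i, beta[i] = β_i.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve matrix * x = rhs exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular Hankel matrix: pole/zero cancellation")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def coefficients_from_markov(s):
    """Recover (alpha, beta) from the Markov parameters s_0, ..., s_{2n-1}."""
    n = len(s) // 2
    hankel = [[s[i + j] for j in range(n)] for i in range(n)]
    # S_n [alpha_0 ... alpha_{n-1}]' = -[s_n ... s_{2n-1}]'
    alpha = solve(hankel, [-s[n + i] for i in range(n)])
    beta = [Fraction(0)] * n
    for k in range(n):
        # beta_{n-1-k} = s_k + alpha_{n-1} s_{k-1} + ... + alpha_{n-k} s_0
        beta[n - 1 - k] = Fraction(s[k]) + sum(alpha[n - g] * s[k - g] for g in range(1, k + 1))
    return alpha, beta


# Markov parameters of (z + 3) / (z^2 - 3z + 2), chosen as a small test case.
alpha, beta = coefficients_from_markov([1, 6, 16, 36])
```

Because the solution depends on the s_i only through S_n^{-1} and polynomial operations, it varies continuously with t as long as det S_n(t) stays nonzero, which is the point of the lemma.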

Lemma 5.3

Any Hankel matrix of a particular signature and rank, n, can
be continuously deformed into a standard form without reducing its rank.

Proof (outline)

The proof defines some standard forms for S_n, one associated
with each possible signature, and then perturbs an arbitrary S_n of a
particular signature and rank n into the standard form without allowing
rank < n, as follows.

1) Perturb S_n very slightly so that D_r ≠ 0 for r = 1,2,...,n
   (see proof for definition of D_r).

2) Perturb S_n so that |D_r| = 1, r = 1,2,...,n.

3) Perturb S_n such that the order of the +1's and -1's in the
   sequence (1, D_1, D_2, ..., D_n) is in a standard form.

Jacobi's Theorem, which determines the signature from the number of sign
changes in the sequence (1, D_1, ..., D_n), is used.

Proof

Let D_r = det S_r, where

          [ s_0       s_1    ...   s_{r-1}  ]
    S_r = [  .                        .     ]
          [ s_{r-1}   s_r    ...   s_{2r-2} ]

Then there are (n+1) possible signatures for S_n namely, n, n-2, ..., -n+2, -n.
Consider now the following standard forms, S_n^{(ν)}, ν = 0,...,n, given by

    s_{2j+1} = 0,    j = 0,1,...,n-2

and s_{2j}, j = 0,1,...,n-1, is implicitly defined by

    D_j = (-1)^j,    j = 0,1,...,ν

and

    D_j = (-1)^ν,    j = ν+1,...,n.

This does indeed define s_{2j}, since using the formula for the
determinant of a partitioned matrix we get

    D_{r+1} = D_r [ s_{2r} - (s_r ... s_{2r-1}) S_r^{-1} (s_r ... s_{2r-1})' ]

therefore

    s_{2r} = D_{r+1}/D_r + (s_r ... s_{2r-1}) S_r^{-1} (s_r ... s_{2r-1})'

and since s_{2j+1} = 0, s_{2r} is fixed by the above formula as a function
of D_1, D_2, ..., D_{r+1}.

We will now show that there exist continuous deformations of
(s_0,...,s_{2n-2}) such that the D_i are as given above and s_{2j+1} = 0,
j = 0,1,...,n-2.

Firstly consider (1,D_1,D_2,...,D_n); we know that D_n ≠ 0 by assumption.
If any D_i = 0, i = 1,...,n-1, then vary (s_0,s_1,...,s_{2n-2}) continuously so
that the new values (s_0*,s_1*,...,s_{2n-2}*) satisfy D_r* ≠ 0, r = 0,...,n,
and in the process of variation no non-zero D_r becomes zero. Such a
variation is always possible since (to quote from Gantmacher page 354
Vol. I), in the space of parameters (s_0,s_1,...,s_{2n-2}) an equation of
the form D_r = 0 determines a certain algebraic hypersurface. If a
point lies in some such hypersurface, then it can always be approximated
arbitrarily closely by points not in these hypersurfaces. Then since
the rank does not change its signature does not change, and hence the
signature is given by Jacobi's Theorem (Gantmacher Vol. I, p. 303),

    σ(S_n) = n - 2V(1,D_1,D_2,...,D_n)

Let ν = V(1,D_1,...,D_n) = number of variations of sign in the
sequence 1,D_1,...,D_n. Now since D_r* ≠ 0 for r = 0,1,...,n, we can make
the following perturbation.


    1)  D_r(t) = (1-t) D_r* + t sgn D_r*,    r = 0,...,n

    2)  s_{2r+1}(t) = (1-t) s*_{2r+1},    r = 0,1,...,n-1

    3)  s_{2r}(t) = D_{r+1}(t)/D_r(t)
                    + (s_r(t) ... s_{2r-1}(t)) S_r^{-1}(t) (s_r(t) ... s_{2r-1}(t))'

        for r = 1,...,n-1 and s_0(t) = D_1(t).

The third equation is well-posed since s_{2r}(t) only depends on
s_i(t) for i < 2r and D_i(t) for i = 1,...,r+1 and D_i(t) ≠ 0 for all t.
Also equation 3) is consistent with 1).
Let D_r(1) = D̂_r, r = 1,2,...,n and s_r(1) = ŝ_r, r = 0,...,2n-2.
Thus |D̂_r| = 1, r = 1,...,n and ŝ_{2r+1} = 0, r = 0,1,...,n-2.
This is in the standard form except that the signs of D̂_r are not
necessarily in the desired order, and a perturbation to change the signs
is now given. Consider the following partition of S_{r+2}:


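              [ s_0      ...  s_{r-1}   |  s_r       |  s_{r+1}  ]
              [  .             .        |   .        |    .      ]
              [ s_{r-1}  ...  s_{2r-2}  |  s_{2r-1}  |  s_{2r}   ]
    S_{r+2} = [ ------------------------+------------+---------- ]
              [ s_r      ...  s_{2r-1}  |  s_{2r}    |  s_{2r+1} ]
              [ ------------------------+------------+---------- ]
              [ s_{r+1}  ...  s_{2r}    |  s_{2r+1}  |  s_{2r+2} ]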


Using the formula for the determinant of a partitioned matrix, and assuming
D_r ≠ 0, it is easy to verify that

    D_{r+2} = D_{r+1} [ s_{2r+2} - (s_{r+1} ... s_{2r}) S_r^{-1} (s_{r+1} ... s_{2r})' ]

              - D_r [ s_{2r+1} - (s_r ... s_{2r-1}) S_r^{-1} (s_{r+1} ... s_{2r})' ]^2

and

    D_{r+1} = D_r [ s_{2r} - (s_r ... s_{2r-1}) S_r^{-1} (s_r ... s_{2r-1})' ]

Now assume sgn D_r = - sgn D_{r+2} and |D_r| = |D_{r+2}| = 1.
Let s_0,...,s_{2r-1} be fixed; this implies D_r is fixed. Then s_{2r} can
be continuously varied so that D_{r+1} becomes -D_{r+1}, and as D_{r+1} varies
we can vary s_{2r+2} and s_{2r+1} continuously so that D_{r+2} remains fixed
(since sgn D_r = - sgn D_{r+2} and D_r ≠ 0). Note that if sgn D_r = sgn D_{r+2},
sgn D_{r+1} cannot change without making D_{r+2} = 0 at some point.


Using the above continuous deformation, if the sequence
1,D_1,D_2,...,D_n contains the triple 1,1,-1 it can be continuously deformed
to 1,-1,-1 or a sequence -1,-1,1 can be changed to -1,1,1. That is the
variation in sign can be moved one place to the left.

Therefore in summary we have continuously deformed S_n to Ŝ_n
such that |D̂_i| = 1 for i = 1,...,n and then we can "concentrate" all
the changes in sign in the sequence (1,D̂_1,...,D̂_n) at the left and hence
have the standard form given initially. ∎

Implications of this result are given in Section 5.4.
5.3 Multi-input/Multi-output Systems

For multi-input/multi-output systems the questions one would
like to answer are "when is the set of minimal systems a connected
subset of parameter space?" and "when is the set of nonminimal systems
of codimension 1?" The codimension of a p-dimensional hypersurface in
R^n is n-p. Hence a surface with codimension greater than 1 cannot
separate any points, whereas a surface of codimension 1 can form a
barrier.

One method of approach is to consider the codimension of the
nonminimal systems as follows. Suppose (A(α),B(α),C(α)) is an affine
parametrization of the system matrices; then the set of nonminimal systems
is given by,

    N = { α | rank [B(α), A(α)B(α), ..., A^{n-1}(α)B(α)] < n }

        ∪ { α | rank [C'(α), A'(α)C'(α), ..., A'^{n-1}(α)C'(α)] < n }

That is the class of nonminimal systems is the union of the sets of
uncontrollable and unobservable systems. Referring back to Chapter 2
let

    P(A,B,κ) = [b_1, Ab_1, ..., A^{k_1-1}b_1, ..., b_m, Ab_m, ..., A^{k_m-1}b_m]

where κ = (k_1,k_2,...,k_m) and Σ_{i=1}^{m} k_i = n with k_i ≥ 0.

Therefore the pair (A(α),B(α)) is uncontrollable if and only if
det P(A(α),B(α),κ) = 0 for all κ. The solution of each equation such
as this will be an algebraic variety of codimension ≥ 1. Now if two
such surfaces can be found which are independent then their intersection
will be of codimension ≥ 2 and hence the set of uncontrollable systems
has codimension ≥ 2. For two such independent surfaces to exist it
is clearly necessary that n > 1 and (A(α),B(α)) is not controllable for
some α. This is a completely general approach but verifying that two
such surfaces are independent is not easy for an arbitrary parametrization.
The following theorem considers some particular parametrizations by
explicitly constructing paths connecting arbitrary minimal systems.
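
The determinant test for controllability described above can be sketched as follows: enumerate every index set κ = (k_1, ..., k_m) with nonnegative entries summing to n, assemble P(A,B,κ) from the chains b_i, Ab_i, ..., and declare the pair controllable if some determinant is nonzero. The function names and the small integer examples are assumptions for illustration.

```python
from itertools import product


def det(m):
    """Determinant by cofactor expansion (adequate for small matrices)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def mat_vec(a, v):
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]


def selection_matrix(a, b_columns, kappa):
    """P(A, B, kappa): columns b_i, A b_i, ..., A^{k_i - 1} b_i for each input i."""
    columns = []
    for b, k in zip(b_columns, kappa):
        v = list(b)
        for _ in range(k):
            columns.append(v)
            v = mat_vec(a, v)
    return [[col[i] for col in columns] for i in range(len(a))]


def index_sets(n, m):
    """All kappa = (k_1, ..., k_m) with k_i >= 0 and k_1 + ... + k_m = n."""
    return [k for k in product(range(n + 1), repeat=m) if sum(k) == n]


def is_controllable(a, b_columns):
    n = len(a)
    return any(det(selection_matrix(a, b_columns, k)) != 0
               for k in index_sets(n, len(b_columns)))


identity = [[1, 0], [0, 1]]
shift = [[0, 1], [0, 0]]
one_input_identity = is_controllable(identity, [[1, 0]])           # single chain b, Ab
two_input_identity = is_controllable(identity, [[1, 0], [0, 1]])   # kappa = (1, 1) works
one_input_shift = is_controllable(shift, [[0, 1]])                 # kappa = (2,) works
```

With a single input there is only one index set, so the uncontrollable systems lie on one surface; with two inputs several determinants must vanish simultaneously, which is why higher codimension becomes possible.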

Theorem 5.2 

Systems that are controllable and observable form a connected 
subset of the parameter space for the following parametrizations of 
the (A,B,C) matrices. 

(i) A, B, and C arbitrary matrices with m > 1, p > 1.

(ii) C arbitrary and (A,B) given by any affine parametrization
(A(α),B(α)) (α ∈ R^q) such that,

    (a) (A(α),B(α)) is controllable for all α ∈ R^q,

    (b) A(α) is such that there exists C ∈ R^{p×n} such that
        (A(α),C) is observable for all α ∈ R^q, and

    (c) there exists c ∈ R^{1×n} and α ∈ R^q such that (A(α),c)
        is observable.

[For example all the controllable canonical parametrizations
given in Chapter 2 with p > m satisfy (a), (b) and (c) above.]

Proof 

(ii) We will construct a perturbation of an arbitrary (A(α),B(α),C)
that is minimal to a standard form, namely (A(0),B(0),Ĉ), as follows.

    (1) perturb C and α very slightly to C^1 and α^1 so that
        (A(α^1),c_1^1) is observable. This is possible by the same
        arguments as in the proof of Theorem 5.1 and by assumption (c).

    (2) perturb c_2,...,c_p to ĉ_2,...,ĉ_p.

    (3) perturb ĉ_2 to c_2^2 and α^1 to α^2 arbitrarily close to ĉ_2
        and α^1 such that (A(α^2),c_2^2) is observable. Make c_2^2
        sufficiently close to ĉ_2 so that (A(α^2),c_2)
        is observable for c_2 = λĉ_2 + (1-λ)c_2^2 and λ ∈ [0,1].

    (4) perturb c_1^1 to ĉ_1.

    (5) perturb c_2 to ĉ_2 by c_2 = λĉ_2 + (1-λ)c_2^2, λ ∈ [0,1].

    (6) perturb α to zero.

It is clear that during this perturbation observability is preserved.
(i) We will construct a continuous perturbation of any (A,B,C) which
takes (A,B,C), while preserving minimality, to a particular parametrization
that satisfies the assumptions of (ii). Assume that p > m (if not
consider (A',C',B') in exactly the same way); we will perturb an arbitrary
system to the canonical form of Theorem 2.4 for some κ = (k_1,k_2,...,k_m).
We know there exists κ such that det(P(A,B,κ)) ≠ 0 by the controllability
assumption.


(1) Perturb (A,B) very slightly to (A 1 , B 1 ) so that 

A A 

det (P ^ 0 for some K. ^ K . 
(2) By Theorem 2.4 there exists a similarity transformation, T,
between (A^1,B^1) and (A^2,B^2) where (A^2,B^2) are in the
canonical form associated with κ. If det T > 0 then I
and T can be continuously connected in GL(n), so perturb
(A^1,B^1,C) to (A^2,B^2,C^2) continuously by a sequence of
similarity transformations in GL(n). If det T < 0 then let

s and P ertur k continuously in the same way to 

(S^{-1} A^2 S, S^{-1} B^2, C^2 S), which will be in a canonical
parametrization similar to Theorem 2.4.

(3) Perturb (A,B,C) in the canonical parametrization avoiding 
unobservable systems as in (ii) above until 

det (P (A,B,K) ) > o. 

(4) Take similarity transformation as in (2) (which necessarily 
has positive determinant) to obtain (A,B,C) in the canonical 
parametrization for k. 
5.4 Implications of Theorems 5.1 and 5.2

Theorem 5.1 has implications in the following situations for
single input/single output systems.

1) Any on-line algorithm where minimality of the estimates is required
if the algorithm is to be well-posed may have problems of the type
mentioned in Section 5.1. Namely if the initial data implies that the
system is in the wrong region of the parameter space then the only way
that the successive estimates can tend to the correct solution is for
the parameter estimates to pass through a surface of nonminimal systems,
when the algorithm will become ill-posed. A simulation example of this
type is given in Section 5.5.

2) Consider the following adaptive stochastic regulator. The output
is passed through a Kalman filter to estimate the state and the input
is obtained by the solution of the infinite-time Riccati equation.
Further the gains in the Kalman filter and the solution of the Riccati
equation are based on the present best estimates of the system parameters
(sometimes this is referred to as open-loop feedback). Now if
stabilizability is lost then the Riccati equation's solution becomes
infinite and if detectability is lost the Kalman filter gains become
infinite. Such an algorithm will thus become ill-posed when detectability
or stabilizability are lost; moreover the surfaces of undetectable
(unstabilizable) systems have local codimension 1 for the standard
controllable (observable) form, and thus such surfaces are likely to
be encountered as the algorithm progresses if the system is unstable and
the initial data estimates the system to be on the wrong side of a
nonminimal surface of codimension 1.
Theorem 5.2 implies that problems such as those that are
outlined above are unlikely to occur for multi-input/multi-output
systems because the set of nonminimal systems forms a surface
of codimension greater than 1 and if it is encountered one can
reasonably blame this on "bad luck" rather than on the almost inevitable
consequence of bad initial data as with the single input/single output
case. However for particular parametrizations difficulties may occur,
for example if the Hankel matrix is symmetric for all α ∈ Ω.

5.5 Simulation Results 

We now present the results of a computer simulation of an
identification algorithm which exhibits difficulty due to the phenomena mentioned
in the preceding sections of this chapter. The algorithm chosen for
this simulation is the output correlation method due to Mehra (1971)
that has been discussed in Chapter 4, for single input/single output
systems. In order to illustrate the difficulties it was only necessary
to estimate the A matrix when (A,C) is in standard observable form.

The system simulated was,

    [ x_1(k+1) ]   [ 0    1   ] [ x_1(k) ]   [ 0 ]
    [          ] = [          ] [        ] + [   ] u(k)
    [ x_2(k+1) ]   [ α_0  α_1 ] [ x_2(k) ]   [ 1 ]

    y(k) = [ 1  0 ] [ x_1(k) ]  +  d v(k)
                    [ x_2(k) ]

where u and v are independent Gaussian white noise sequences with unit
covariance. The steady state was reached before any observations were
used.

Estimates of the output correlation function, C_j = E{y(k)y(k+j)},
are given by

    Ĉ_j^N = (1/N) Σ_{k=1}^{N} y(k)y(k+j)

Then the coefficients in the A matrix are estimated by,

    [ α̂_0^N ]   [ Ĉ_1^N   Ĉ_2^N ]^{-1} [ Ĉ_3^N ]
    [        ] = [                ]      [        ]
    [ α̂_1^N ]   [ Ĉ_2^N   Ĉ_3^N ]      [ Ĉ_4^N ]


Now the indicated inverse will only exist if the corresponding
system is second order. However the inverse will not necessarily exist
for the estimated correlation coefficients; indeed if for the true
correlation coefficients the above determinant is negative and the initial
estimates of the determinant are positive then as better estimates are
made the determinant must pass through zero and the parameter estimates
in this region will become arbitrarily large. This behaviour was indeed
manifested in several examples, one of which is now given.

True parameter values: α_0 = -0.24, α_1 = 1.0, d = 0.1. With
these parameter values the pulse transfer function is

    g(z) = 1 / ((z-0.4)(z-0.6)).

Typical sample paths are given in Figures 5.1, 5.2 and 5.3, which show
the difficulties encountered. Further if the
matrix inverse is evaluated by some recursive scheme then large numerical
errors may accumulate if the determinant becomes very small.
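
The sign of the determinant for this example can be checked without running a simulation. The sketch below is not the simulation reported in the figures; it assumes the second-order recursion x_1(k+2) = α_1 x_1(k+1) + α_0 x_1(k) + u(k) with unit-variance white u, computes the exact steady-state correlations from the Yule-Walker relations, and applies the 2×2 estimator to them. The measurement noise d v(k) only affects C_0, which the estimator does not use. All function names are illustrative.

```python
def true_correlations(alpha0, alpha1, count):
    """Steady-state correlations r_0..r_{count-1} of x_1 (equal to C_j for j >= 1)."""
    rho = alpha1 / (1.0 - alpha0)                        # r_1 / r_0
    r0 = 1.0 / (1.0 - alpha0 ** 2 - alpha1 * rho * (1.0 + alpha0))
    r = [r0, rho * r0]
    while len(r) < count:
        r.append(alpha1 * r[-1] + alpha0 * r[-2])        # r_j = a1 r_{j-1} + a0 r_{j-2}
    return r


def estimate_from_correlations(c1, c2, c3, c4):
    """Solve [[c1, c2], [c2, c3]] [alpha0, alpha1]' = [c3, c4]' by Cramer's rule."""
    determinant = c1 * c3 - c2 * c2
    alpha0 = (c3 * c3 - c2 * c4) / determinant
    alpha1 = (c1 * c4 - c2 * c3) / determinant
    return determinant, alpha0, alpha1


r = true_correlations(-0.24, 1.0, 5)
determinant, est_alpha0, est_alpha1 = estimate_from_correlations(r[1], r[2], r[3], r[4])

# A 5 percent error in the estimate of C_2 alone already flips the sign.
perturbed_determinant, _, _ = estimate_from_correlations(r[1], 0.95 * r[2], r[3], r[4])
```

Under these assumptions the true determinant is negative and the exact correlations return the true parameters, while a modest error in a single correlation estimate puts the determinant on the other side of zero, which is the situation in which the estimates must cross the singular surface as more data arrive.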

However the alternate method suggested by Mehra (1971)
does not have these difficulties because the surface
{A ∈ R^{m×n}, m > n | det(A'A) = 0} has codimension ≥ 2 and hence
cannot separate any regions of the parameter space.
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5.5 Other Geometrical Properties 

If one could parametrize linear systems so that all the systems
of a given order were represented but none of a lower order were
represented, then all the problems outlined previously would be avoided.
For the single input/single output case one would require (n+1)
parametrizations, one for each of the regions of Theorem 5.1. Referring
to the proof of Lemma 5.3, one can see that each region is characterized
by the number of changes in sign of the principal minors of the Hankel
matrix.

Two regions are easily parametrized, namely those with a positive
definite Hankel matrix and those with a negative definite Hankel matrix,
in which cases all the principal minors are always nonzero. However, the
other regions correspond to some collection of minors which may be
positive, negative or zero. This parametrization problem seems very
difficult and has not been solved except when m = 2. One observation
which perhaps illustrates why this parametrization is difficult is that
when n > 3 it is possible for two complex conjugate poles to be cancelled
by two zeros, and such a cancellation occurs on a nonminimal surface of
codimension 2. This means that some regions in parameter space will have
as their boundaries nonminimal surfaces of codimension 1 and be
"punctured" by surfaces of codimension 2.
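
The sign-change count of the Hankel minors can be illustrated with a minimal sketch. Theorem 5.1 and Lemma 5.3 are not reproduced here, so the counting convention is an assumption: the leading principal minors D1, ..., Dn are prefixed with 1 and sign changes are counted along that sequence, with all minors assumed nonzero. The helper names and the pole/residue values are hypothetical.

```python
def markov_from_residues(poles, residues, count):
    """h_k = sum_i r_i * p_i**(k-1), k = 1..count, for g(z) = sum_i r_i/(z - p_i)."""
    return [sum(r * p ** k for p, r in zip(poles, residues)) for k in range(count)]


def hankel(h, n):
    """n x n Hankel matrix with entries h[i + j] (needs 2n - 1 Markov parameters)."""
    return [[h[i + j] for j in range(n)] for i in range(n)]


def det(m):
    """Determinant by Laplace expansion; adequate for the tiny matrices used here."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, v in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * v * det(minor)
    return total


def sign_changes(h_matrix):
    """Count sign changes in 1, D1, ..., Dn (leading principal minors)."""
    minors = [det([row[:k] for row in h_matrix[:k]])
              for k in range(1, len(h_matrix) + 1)]
    if any(d == 0 for d in minors):
        raise ValueError("a leading principal minor vanishes")
    seq = [1.0] + minors
    return sum(1 for a, b in zip(seq, seq[1:]) if a * b < 0)


def region(poles, residues):
    n = len(poles)
    return sign_changes(hankel(markov_from_residues(poles, residues, 2 * n - 1), n))
```

For the second-order poles 0.4 and 0.6, `region([0.4, 0.6], [1.0, 1.0])`, `region([0.4, 0.6], [2.0, -1.0])` and `region([0.4, 0.6], [-1.0, -1.0])` give three different counts, matching the n+1 = 3 regions expected for n = 2 under this convention.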

The parametrization problem for multi-input or multi-output 
minimal systems is even more complex. 


CHAPTER 6 


CONCLUSIONS 

It is hoped that this thesis has pointed out the importance in
identification of some structural properties of linear systems,
particularly the parametrization of linear systems. We would like to
conclude by stating the practical implications of the results of this
research by way of some specific suggestions concerning identification.

1) If it is not required to have a physical interpretation of
the state space realization of a particular system, then standard linear
system parametrizations are appropriate. The discussion of Chapter 2
suggests that for multivariable systems true canonical forms (e.g. Popov's)
are not desirable because of numerical difficulties near boundary points.
An alternate family of globally identifiable parametrizations, well
suited to identification, is given there.

2) If a natural parametrization of the system matrices is
given by physical considerations, then before any identification is
attempted it is recommended that at least the local identifiability of
the parametrization be checked at nominal values of the unknown parameters.
Then if the parametrization is found not to be identifiable it is
straightforward from Theorem 3.4 to see which parameters need to be
fixed at their nominal values in order to make the remainder identifiable.

3) If feedback is present around a system, then it is suggested
that the system equations be rewritten as a linear system without
feedback, but with the feedback matrix parametrizing the A matrix. Then the
local or partial identifiability results can be used to determine whether
the unknown parameters in the open-loop system can be identified, either
together with or independently of the feedback system.

4) If a system is driven by an unobserved white noise process,
then special care should be exercised in choosing the parametrization, and
the results of Chapter 4 should be used.

5) If minimality of successive system estimates is required
for an on-line algorithm to be well-posed, then (at least for single input/
single output systems and for certain multivariable parametrizations)
on-line identification should only proceed from initially good parameter
estimates, since the set of nonminimal systems forms a surface of
codimension one in the parameter space. Further, it is recommended that
algorithms should not require the successive system estimates to be
minimal, so that these problems will not occur.
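
Suggestion 3 can be illustrated by a minimal worked equation. It assumes, hypothetically, a discrete-time system with static output feedback gain K and an external input v(k); these symbols and this particular feedback structure are assumptions made for illustration.

```latex
x(k+1) = A x(k) + B u(k), \quad y(k) = C x(k), \quad u(k) = K y(k) + v(k)
\;\Longrightarrow\;
x(k+1) = (A + BKC)\, x(k) + B v(k), \quad y(k) = C x(k).
```

Under this assumption the closed loop is an ordinary system from v to y whose A matrix, A + BKC, is parametrized jointly by the open-loop unknowns and the entries of K, so the same identifiability tests apply to the combined parameter vector.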

A general procedure for determining identifiability that can be
applied to other situations is as follows. Firstly, characterize by a
set of equations all systems indistinguishable from one another given the
observations; then determine whether this set of equations has a unique
solution when the systems are restricted to lie in some parametrization.
A local result can then be obtained by linearizing these equations.
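
The linearization step can be sketched numerically. The sketch below is not Theorem 3.4; it is a generic rank test under stated assumptions: a hypothetical second-order model with A = [[0, 1], [a0, a1]], B = [0, b]', C = [c, 0], the transfer function represented by its first 2n = 4 Markov parameters, and local identifiability taken to hold when the Jacobian of those parameters with respect to the free unknowns has full column rank. All names and the tolerance are illustrative.

```python
def markov(theta, count=4):
    """First `count` Markov parameters C A^(k-1) B of the assumed model."""
    a0, a1, b, c = theta
    x = [0.0, b]                              # x = B = [0, b]'
    out = []
    for _ in range(count):
        out.append(c * x[0])                  # C = [c, 0]
        x = [x[1], a0 * x[0] + a1 * x[1]]     # x <- A x
    return out


def jacobian(theta, free, eps=1e-6):
    """Central-difference Jacobian w.r.t. the parameters indexed by `free`."""
    cols = []
    for i in free:
        up = list(theta); up[i] += eps
        dn = list(theta); dn[i] -= eps
        cols.append([(u - d) / (2 * eps)
                     for u, d in zip(markov(up), markov(dn))])
    return [list(row) for row in zip(*cols)]  # rows: outputs, columns: parameters


def rank(matrix, tol=1e-6):
    """Numerical rank by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    r = 0
    for col in range(len(m[0])):
        if r == len(m):
            break
        pivot = max(range(r, len(m)), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) < tol:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][col] / m[r][col]
            m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def locally_identifiable(theta, free):
    return rank(jacobian(theta, free)) == len(free)


theta0 = (-0.24, 1.0, 1.0, 1.0)   # nominal (a0, a1, b, c), hypothetical values
```

With all four parameters free the test fails, because only the product bc enters the transfer function; fixing c at its nominal value (free indices 0, 1, 2) makes the remaining parameters pass the test.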

Open problems that have originated from this research are: 

1) Finding good sufficient conditions for the global 
identifiability of an arbitrary parametrization. 

2) Finding globally identifiable parametrizations. 

3) Determining whether a parametrization will be troubled 
by local minima. This will depend both on the cost function 
being minimized and the parametrization. 

4) Parametrizing linear systems driven by unobserved 
white noise, in a similar way to the results of Chapter 2 for 
input/output systems. 
5) Parametrizing minimal linear systems, i.e. finding
families of globally identifiable parametrizations that do
not admit multiple representations of nonminimal systems but
represent every minimal system.

6) Further study of the mathematical structure of the 
class of linear systems modulo equivalence, to give greater 
insight into the nature of the object one is trying to identify. 
APPENDIX I


KRONECKER PRODUCTS 

In Chapter 3 it is often required to rewrite linear equations
in an unknown matrix as an equation in an unknown vector so that rank
conditions can be written explicitly. In this section the necessary
background is given (see also Halmos (1958) and Pease (1965)).
Consider the matrix equation
(*) A X B = C

in the unknown matrix X (n x m), with A (p x n), B (m x r) and C (p x r)
known matrices. This is a linear equation in X, and thus if X is rewritten
as a vector by some lexicographical ordering then (*) can be written
as a vector equation. Two natural orderings of the elements of X
come to mind: firstly to list X row by row, and secondly column by
column. The first ordering has been chosen here arbitrarily.


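Let $X' = [x_1\ x_2\ \ldots\ x_n]$ with $x_i \in R^m$ and define $\bar{X} \in R^{nm}$
as the vector

$$\bar{X}' = [x_1'\ x_2'\ \ldots\ x_n']$$

Also let $C' = [c_1\ c_2\ \ldots\ c_p]$ with $c_i \in R^r$ and $\bar{C}' = [c_1'\ c_2'\ \ldots\ c_p']$.

Now equation (*) gives for $i = 1, \ldots, p$

$$c_i' = \sum_{j=1}^{n} a_{ij}\, x_j' B$$

and hence

$$\begin{bmatrix} a_{11}B' & a_{12}B' & \cdots & a_{1n}B' \\ \vdots & \vdots & & \vdots \\ a_{p1}B' & a_{p2}B' & \cdots & a_{pn}B' \end{bmatrix} \bar{X} = \bar{C}$$

Now the Kronecker product of two matrices A and B, denoted $A \otimes B$,
is defined as

$$A \otimes B = \begin{bmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ \vdots & \vdots & & \vdots \\ a_{p1}B & a_{p2}B & \cdots & a_{pn}B \end{bmatrix}$$

Therefore equation (*) can be written concisely as the vector equation

$$(A \otimes B')\, \bar{X} = \bar{C}$$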
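
The vector form of (*) can be checked numerically with a short sketch. The helper names (`kron`, `row_vec`, `col_vec`, and so on) are hypothetical, and the integer matrices are arbitrary test values chosen so that the comparison is exact.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def kron(a, b):
    """Kronecker product: block (i, j) of the result is a[i][j] * b."""
    return [[a_ij * b_kl for a_ij in a_row for b_kl in b_row]
            for a_row in a for b_row in b]


def row_vec(x):
    """List the elements of x row by row."""
    return [v for row in x for v in row]


def col_vec(x):
    """List the elements of x column by column."""
    return row_vec(transpose(x))


def apply(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


A = [[1, 2, 3], [4, 5, 6]]        # p x n = 2 x 3
X = [[1, 0], [2, -1], [0, 3]]     # n x m = 3 x 2
B = [[2, 1], [0, -1]]             # m x r = 2 x 2
C = matmul(matmul(A, X), B)       # p x r

row_form = apply(kron(A, transpose(B)), row_vec(X))   # (A (x) B') applied to X row by row
col_form = apply(kron(transpose(B), A), col_vec(X))   # (B' (x) A) applied to X column by column
```

Here `row_form` reproduces C listed row by row, and `col_form` reproduces C listed column by column, so both orderings lead to a Kronecker-product equation with the factors interchanged.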



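Had the column-by-column ordering been chosen instead, the vector
equation would have turned out to be

$$(B' \otimes A)\, \bar{X} = \bar{C}$$

with $\bar{X}$ and $\bar{C}$ now formed from the columns of X and C.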
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